TITLE: Novel yeast promoters suitable for expression cloning in
yeast and heterologous expression of proteins in yeast.

FIELD OF INVENTION 


BACKGROUND OF THE INVENTION 

The advent of recombinant DNA techniques has made it
possible to select single protein components with interesting
properties and produce them on a large scale. This represents an
improvement over the previously employed production process
using micro-organisms isolated from nature and producing a
mixture of proteins which would either be used as such or
separated after the production step. However, the conventional
cloning techniques have the drawback that each protein component
has to be purified and characterised by its (partial) amino acid
sequence before it is possible to prepare synthetic
oligonucleotide probes for hybridisation experiments. Since this
is a rather time-consuming process, the cloning of novel proteins
might be considerably expedited by using a screening method
involving selecting clones expressing a desired protein activity,
i.e. expression cloning.

Recently, a novel method for cloning of fungal enzyme
genes by expression cloning in yeast was developed by Dalboge
and Heldt-Hansen (A novel method for efficient expression
cloning of fungal enzyme genes. Mol. Gen. Genet. 243: 253-260
(1994), WO 93/11249).

This expression cloning technique combines the ability
of a yeast strain (e.g. Saccharomyces cerevisiae) to express
heterologous genes with the utilisation of sensitive enzyme
plate assays. The principle in expression cloning is outlined in
figure 1.

This method makes it possible to clone enzyme genes
independently of knowledge of the amino acid sequence and has
proven successful in cloning a number of new enzymes.

Even though the above described method has already
proven successful, there is still room for improvement.

The expression cloning technique can be improved by
identifying new improved promoters, e.g. to increase the
expression of naturally low expressed enzymes and thereby
facilitate the subsequent screening.

EP 220864 describes a Yarrowia lipolytica yeast promoter,
XPR2. The XPR2 yeast promoter is only active at pH above 6.0 on
media lacking preferred carbon and nitrogen sources and full
induction requires high levels of peptone in the culture medium
(Ogrydziak, D.M., Demain, A.L., and Tannenbaum, S.R. (1977)
Biochim. Biophys. Acta. 497: 525-538; Ogrydziak, D.M. and
Scharf, S.J. (1982) J. Gen. Microbiol. 128: 1225-1234).

The demand for pH above 6.0 in the medium makes it
difficult to screen directly for secreted enzymes that are active
only in an acidic environment.

Therefore, an object of the present invention is to
provide new improved yeast promoters, especially for use in
expression cloning in yeast, but also for heterologous
expression of a desired polypeptide in an expression system of
choice.

SUMMARY OF THE INVENTION 

The present invention is based on the cloning and
characterisation of two DNA sequences shown in SEQ ID NO 1 and 2,
respectively, which both:

1) have yeast promoter activity, and
2) have improved properties for expression cloning in yeast.

Further deletion studies on both yeast promoter
sequences have identified the most important regions for each
yeast promoter.

For the yeast promoter shown in SEQ ID NO 1, the most
important region is from position -241 to -41 and for the yeast
promoter shown in SEQ ID NO 2, it is from position -163 to -3.
For further details see example 8.
Accordingly, in a first aspect the invention relates to
a cloned yeast promoter DNA sequence, which comprises

a) the DNA sequence from position -241 to -41 shown in SEQ ID NO
1, or

b) an analogue of the DNA sequence defined in a) which

i) is at least 90 % homologous with said DNA sequence,
or

ii) hybridises with the same nucleotide probe as the DNA
sequence defined in a).

In a second aspect the invention relates to a cloned
yeast promoter DNA sequence, which comprises

a) the DNA sequence from position -163 to -3 shown in SEQ ID NO
2, or

b) an analogue of the DNA sequence defined in a) which

i) is at least 90 % homologous with said DNA sequence,
or

ii) hybridises with the same nucleotide probe as the DNA
sequence defined in a).

In a further aspect the invention relates to an
expression vector comprising a cloned yeast promoter according
to the invention.

In a further aspect the invention relates to the use of 
said expression vector for expression cloning in yeast. 

Further the invention relates to a process for producing
a polypeptide of interest in a yeast host cell, the process
comprising transforming a suitable yeast host cell with a
recombinant expression vector comprising i) a yeast promoter of
the invention and ii) a DNA sequence coding for a polypeptide of
interest, culturing the transformed cells under suitable
conditions to express the polypeptide, and recovering the
expressed polypeptide from the culture.

Finally the invention relates to the use of a 
polypeptide produced as described above for various industrial 
applications. 

BRIEF DESCRIPTION OF DRAWINGS

Fig. 1: Flow scheme of expression cloning. 

Fig. 2: Plasmid used for construction of the genomic library (A).
The Sau3A I digested genomic DNA was cloned at the BamHI
sites after removal of the kanamycin resistance gene.
The kanamycin resistance gene is flanked by two
inverted repeats which spoil the ability of the plasmid
to replicate unless separated by an insert. (B) Example
of an expression vector used for examination of the
different yeast promoter sequences. All expression
vectors used contain selection markers and sequences for
replication in E. coli and Y. lipolytica as in pY3X1 (see
figure 2). The different yeast promoter sequences were
cloned as ClaI/BamHI fragments and tested in constructs
in which either the 43kD Cellulase II (WO 91/17243) or
Xylanase I from Humicola insolens (WO 92/17573) were
used as reporter genes.

Fig. 3. Yeast strain POld grown in YP medium with 2% galactose,
glucose, glycerol, lactose or maltose added. Used to identify
optimal conditions for making a POld cDNA library (see
example 1).

Fig. 4. Frequency of cDNA sequences selected for further
examination. L1 and L2 refer to the library from which the
sequences come and the subsequent number refers to the
clone number in the library concerned. Variation in
starting point of the sequences reflects the cDNA
synthesis events. The sequences that include the most of
the 5' end of the sequences were used for further
analysis (see example 2-7).

Fig. 5. Strategy used for sequence determination of the L1.41
related genomic DNA.



Fig. 6. Nucleotide sequence of the relevant part of the L1.41
related genomic DNA (L1.41 is identical to SEQ ID NO 1).
The positions are related to the A in the ATG start
codon (bold), defined as +1. The putative UAS* boxes
HOMOL1 (position -191 - -180) and RPG (-179 - -168) and
the T-rich sequence are underlined. The putative TATA
box (-111 - -106) and a pyrimidine-rich sequence (-85 -
-58) are double underlined. The putative transcription
initiation site (-56 - -53) is written in bold.
Nucleotides located from position -40 and downstream
were also present in the cDNA sequence.

Fig. 7. The nucleotide sequence and the deduced amino acid
sequence of the translation elongation factor EF-1α cDNA
from Y. lipolytica. Restriction sites for HindIII
(position 224) and KpnI (position 353). This sequence is
identical to SEQ ID No 3.



Fig. 8. Nucleotide sequence of the L2.17 related genomic DNA
(L2.17 is identical to SEQ ID NO 2). The positions are
related to the A in the ATG start codon defined as +1.
The putative UAS* boxes HOMOL1 (position -273 - -262) and
RPG (-247 - -236) and the T-rich sequence (present on
the opposite strand) are double underlined. Putative TATA
boxes (-201 - -190), a TATA-like sequence (-46 - -41) and
transcription initiation consensus sequences (-8, -55,
-15 and -13) are underlined. The genomic sequence
(including the ATG start codon) also present in the cDNA
sequence (165-173) and the 3' splice site (176-178) of
the intron are written in bold.

Fig. 9. The nucleotide sequence and the deduced amino acid
sequence of ribosomal protein S7 cDNA from Y. lipolytica.
The KpnI restriction site (position 445) is underlined.
This sequence is identical to SEQ ID No 4.



Fig. 10. The strategy used for deletion analysis of the TEF gene
yeast promoter sequence (SEQ ID NO 1). (A) The part of the
genomic sequence located upstream of the cDNA sequence. (B,
C, D) 5' deletions of the sequence in (A). As shown,
none of the deletions affected the putative elements
of the basal yeast promoter region. In edition D, the
putative UAS* boxes are deleted.

Fig. 11. The strategy planned for deletion analysis of the
ribosomal protein S7 yeast promoter sequence (SEQ ID NO
2). Successful cloning was only obtained for B, D and F.
In D, the putative TATA-box and the putative UAS* boxes
are excluded. In F the TATA-like sequence and the four
3' terminal transcription initiation consensus sequences
are excluded.

Fig. 12. Initial activity measurement of the yeast promoters of
the invention (A and B). SC-leu growth plate + AZCL Birch
xylan substrate (B). POld XPR2 optimal medium growth
plates + AZCL HE-cellulose substrate (C). XPR2 optimal
medium growth plates + AZCL Birch xylan substrate (D).
The vector constructions are described in Table 1.

DETAILED DESCRIPTION OF THE INVENTION

Cloned yeast promoters

In preferred embodiments the present invention provides
two cloned yeast promoters. One of the promoters comprises

a) the DNA sequence from position -241 to -41 shown in SEQ ID NO
1, or

b) an analogue of the DNA sequence defined in a) which

i) is at least 90 % homologous with said DNA sequence,
or

ii) hybridises with the DNA sequence defined in a).

The other promoter comprises

a) the DNA sequence from position -163 to -3 shown in SEQ ID NO
2, or

b) an analogue of the DNA sequence defined in a) which

i) is at least 90 % homologous with said DNA sequence,
or

ii) hybridises with the DNA sequence defined in a).

The promoters of the invention may comprise additional
nucleotides to those specified above. In particular the promoter
may comprise nucleotides -407 to -41 of SEQ ID NO 1 or
nucleotides -543 to -3 of SEQ ID NO 2.
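
By way of illustration only, the position numbering used for the promoter regions may be sketched as follows. The sketch assumes that positions are counted relative to the A of the ATG start codon defined as +1, that the base immediately upstream is -1 (no position 0), and that both end positions are inclusive. The function name promoter_slice and the test sequence are hypothetical and are not taken from the sequence listing.

```python
def promoter_slice(upstream: str, start: int, end: int) -> str:
    """Return the bases from position `start` to `end`, both inclusive.

    `upstream` is the sequence immediately 5' of the ATG start codon,
    written 5'->3', so its last base is position -1 (assumed convention:
    the A of ATG is +1 and there is no position 0).
    """
    if not (start <= end < 0):
        raise ValueError("expected start <= end < 0")
    n = len(upstream)
    if -start > n:
        raise ValueError("sequence does not reach position %d" % start)
    # Position p (negative) corresponds to the 0-based index n + p.
    return upstream[n + start : n + end + 1]


# Hypothetical 10-base upstream sequence covering positions -10 to -1.
example = "ACGTACGTAC"
region = promoter_slice(example, -4, -2)  # bases at positions -4, -3 and -2
```

Under these assumptions the region from -241 to -41 spans 201 nucleotides and the region from -163 to -3 spans 161 nucleotides.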

A cloned yeast promoter refers to a yeast promoter
cloned by a standard cloning procedure used in genetic engineering
to relocate a segment of DNA from its natural location to a
different site where it will be reproduced. The cloning process
involves excision and isolation of the desired DNA segment,
insertion of the piece of DNA into the vector molecule and
incorporation of the recombinant vector into a cell where multiple
copies or clones of the DNA segment will be replicated.

As defined herein, a DNA sequence analogous to either of
the two isolated DNA sequences of the present invention is
intended to indicate any yeast promoter DNA sequence, which DNA
sequence has one or more of the properties cited under (i)-(ii)
above.

The yeast promoter DNA sequence of the invention may be
isolated from a Yarrowia lipolytica yeast strain, or another or
related organism, as will be described in further detail
below (see section "Microbial sources").

Alternatively, the promoter sequence of the invention
may be constructed on the basis of the DNA sequence shown in
SEQ ID NO 1 or SEQ ID NO 2, e.g. be a subsequence thereof, or a
DNA sequence resulting from introduction of one or more
nucleotide changes (i.e. deletions, insertions, substitutions,
or addition of one or more nucleotides in the sequence) which
do not affect (in particular impair) the yeast promoter
activity.

Regions which can be modified without significantly
affecting the yeast promoter activity can be identified by
deletion studies. For further details see example 8.

The homology referred to in i) above is determined as
the degree of identity between the two sequences indicating a
derivation of the first sequence from the second. The homology
may suitably be determined by means of computer programs known
in the art such as GAP provided in the GCG program package
(Needleman, S.B. and Wunsch, C.D., (1970), Journal of Molecular
Biology, 48, 443-453). Using GAP with the following settings for
DNA sequence comparison: GAP creation penalty of 5.0 and GAP
extension penalty of 0.3, the coding region of the DNA sequence
exhibits a degree of identity preferably of at least 90%, more
preferably at least 95%, more preferably at least 97% with any
of the DNA sequences shown in SEQ ID No. 1 or 2.
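
For illustration, a degree of identity of the kind referred to above may be approximated by a global alignment with affine gap penalties. The following is a minimal sketch and is not the GAP program itself. It assumes a match score of 1.0 and a mismatch score of 0.0, assumes that a gap of length k costs the creation penalty plus k times the extension penalty, and assumes that identity is counted over all alignment columns including gapped ones. These choices, and the function name percent_identity, are assumptions made for the example, so results may differ from those of GAP.

```python
def percent_identity(a, b, match=1.0, mismatch=0.0,
                     gap_create=5.0, gap_extend=0.3):
    """Percent identity from a global alignment with affine gap penalties."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    a, b = a.upper(), b.upper()
    neg = float("-inf")
    n, m = len(a), len(b)
    first = gap_create + gap_extend  # cost of the first position of a gap
    # M: base against base; X: base of a against gap; Y: base of b against gap
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = -(gap_create + gap_extend * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_create + gap_extend * j)

    # Candidate predecessor scores, always ordered (from M, from X, from Y).
    def into_m(i, j):
        return (M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])

    def into_x(i, j):
        return (M[i - 1][j] - first, X[i - 1][j] - gap_extend, Y[i - 1][j] - first)

    def into_y(i, j):
        return (M[i][j - 1] - first, X[i][j - 1] - first, Y[i][j - 1] - gap_extend)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = score + max(into_m(i, j))
            X[i][j] = max(into_x(i, j))
            Y[i][j] = max(into_y(i, j))

    # Trace back from the best final state, counting columns and identities.
    finals = (M[n][m], X[n][m], Y[n][m])
    state = finals.index(max(finals))
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if state == 0:
            matches += a[i - 1] == b[j - 1]
            cands = into_m(i, j)
            i, j = i - 1, j - 1
        elif state == 1:
            cands = into_x(i, j)
            i -= 1
        else:
            cands = into_y(i, j)
            j -= 1
        state = cands.index(max(cands))
    return 100.0 * matches / columns
```

A candidate analogue would then be compared against the relevant region of SEQ ID NO 1 or 2 and the returned value checked against the 90% threshold. The quadratic memory use makes the sketch suitable only for short sequences such as promoter regions.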

The hybridisation referred to in (ii) above is intended
to indicate that the analogous DNA sequence hybridises to the
yeast promoter DNA sequence under certain specified conditions
which are described in detail in the Materials and Methods
section hereinafter. The oligonucleotide probe to be used is the
DNA sequence from position -241 to -41 in SEQ ID NO 1 or from
-163 to -3 in SEQ ID NO 2. The oligonucleotide probe used herein
is preferably a double-stranded DNA probe.

The DNA sequence encoding a yeast promoter of the
invention can be isolated from a suitable organism by colony
hybridisation using a 5'-cDNA sequence from the corresponding
coding sequence.

An example of a flow scheme for such a cloning strategy is given
just below; for further details see example 1-7.

Construct yeast promoter containing cDNA libraries, e.g.
a Y. lipolytica cDNA library

Determine sequences of 100 arbitrarily chosen clones from
each library - examine for repeats

Identify the yeast promoter sequences of highly expressed
genes by hybridization to a genomic library

Clone the yeast promoters in reporter constructs and
characterize the yeast promoter.
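
The "examine for repeats" step of the flow scheme above may be sketched as a simple tally. The sketch assumes that each sequenced clone has already been assigned an identifier for the transcript it represents (for instance by sequence comparison). The clone names, transcript labels and the function name repeated_transcripts are hypothetical placeholders and do not represent data from the libraries.

```python
from collections import Counter


def repeated_transcripts(clone_to_transcript, min_count=2):
    """List transcripts seen in at least `min_count` clones, most frequent first."""
    counts = Counter(clone_to_transcript.values())
    return [(name, n) for name, n in counts.most_common() if n >= min_count]


# Placeholder assignments for six clones of one library (not real data).
library = {
    "clone1": "transcriptA",
    "clone2": "transcriptB",
    "clone3": "transcriptA",
    "clone4": "transcriptC",
    "clone5": "transcriptA",
    "clone6": "transcriptB",
}
candidates = repeated_transcripts(library)
```

Transcripts recovered repeatedly among arbitrarily chosen clones are taken as candidates for highly expressed genes, whose upstream sequences are then sought in the genomic library.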

Alternatively, the DNA encoding a yeast promoter of the
invention may, in accordance with well-known procedures,
conveniently be isolated from a suitable source, such as any of
the below mentioned organisms, by use of synthetic
oligonucleotide probes prepared on the basis of a DNA sequence
disclosed herein. For instance, a suitable oligonucleotide probe
may be prepared on the basis of the nucleotide sequences
presented as SEQ ID No. 1 or 2.

Expression cloning 

In the present context the term "expression cloning in
yeast" refers to the technique described by Dalboge and
Heldt-Hansen (A novel method for efficient expression cloning of
fungal enzyme genes. Mol. Gen. Genet. 243: 253-260 (1994), WO
93/11249).

The principle in the expression cloning technique is
further outlined in figure 1.

Briefly, the principle of the technique is as follows. When
an organism that secretes an enzyme of interest is identified, it
is grown under inducing conditions and poly(A) enriched RNA is
isolated. A directional cDNA library is constructed in an E. coli
/yeast shuttle vector under control of a yeast promoter and E.
coli is transformed. Plasmid DNA is isolated and introduced into a
yeast strain, e.g. S. cerevisiae. The yeast cells are spread on
selective growth plates and replicated to selective and inducing
plates which contain the relevant enzyme substrate, e.g. xylan
when screening for xylanase activity. The xylan is e.g. added as
cross-linked insoluble granules, which in the presence of xylanase
activity will be degraded, leading to a dyed halo formation around
the positive yeast colonies.

When a positive yeast colony has been identified and
retested, the cDNA is isolated and cloned in an A. oryzae expression
vector. A. oryzae is transformed, which makes large scale
production of the enzyme possible (Christensen et al, 1988).

Yeast promoter:

Yeast promoter refers to the nucleotide sequence(s) at
the 5' end of a structural gene which direct(s) the initiation of
transcription. The promoter sequence is to drive the expression
of a downstream gene. The promoter drives transcription by
providing binding sites to RNA polymerases and other initiation
and activation factors. Usually the promoter drives
transcription preferentially in the downstream direction. The
level of transcription is regulated by the promoter. Thus, in
the construction of heterologous promoter/structural gene
combinations, the structural gene is placed under the regulatory
control of a promoter such that the expression of the gene is
controlled by the promoter sequence(s). The promoter is
positioned preferentially upstream to the structural gene and at
a distance from the transcription start site that approximates
the distance between the promoter and the gene it controls in
its natural setting. As is known in the art, some variation
in this distance can be tolerated without loss of promoter
function.

The transcription efficiency of the promoter may, for
instance, be determined by a direct measurement of the amount of
mRNA transcribed from the promoter, e.g. by Northern blotting or
primer extension, or indirectly by measuring the amount of gene
product expressed from the promoter.

A FastA search (Pearson and Lipman. P.N.A.S. USA 85:
2444-2448 (1988)) on the GenEMBL database showed significant
similarity of the downstream cDNA sequence, controlled by the yeast
promoter shown in SEQ ID NO 1, to the translation elongation factor
EF-1α gene (TEF) of various sources, e.g. Arxula adeninivorans,
Neurospora crassa and Saccharomyces cerevisiae.

In the present context the term "EF-1α yeast promoter"
is used to indicate the untranslated region upstream of
the ATG start codon for the EF-1α gene (e.g. ATG start codon in
SEQ ID NO 3) which contains most, if not all, features required
for expression. For further details see Example 6.

A similar FastA search on the cDNA sequence controlled by
the yeast promoter shown in SEQ ID NO 2 showed significant
similarity to the ribosomal protein S7 (RP S7) of S. cerevisiae
and the corresponding ribosomal protein S4 of e.g. D. melanogaster
and H. sapiens.

In the present context the term "ribosomal protein S7
yeast promoter" is used to indicate the untranslated
region upstream of the ATG start codon for the ribosomal protein
S7 gene (e.g. ATG start codon in SEQ ID NO 4) which contains
most, if not all, features required for expression. For further
details see Example 7.

Both EF-1α and ribosomal protein S7 are essential
for growth of Y. lipolytica. Thus the pH tolerance of both the
EF-1α yeast promoter and the ribosomal protein S7 yeast promoter
covers at least the pH range where Y. lipolytica is able to grow.

For both yeast promoters of the invention this is
estimated to be in the pH range preferably from 4-11, more
preferably from 4-10, more preferably from 4-9, more preferably
from 4-8, more preferably from 5-11, more preferably from 5-10,
more preferably from 5-9, more preferably from 5-8.

In the context of expression cloning an ideal yeast
promoter meets the following criteria:

Strength. A strong yeast promoter is a necessary premise for a
high expression level, and the low copy number of the ARS18
(Fournier, P. et al. Yeast 7:25-36 (1991)) based expression
vectors makes this demand even more important when Y. lipolytica
is used as the host organism.

Activity in a suitable medium. In the context of expression
cloning a suitable medium is a medium from which it is easy to
purify the secreted product for initial characterisation and it
is a medium which is selective.

Use of a selective medium makes it possible to screen directly
for positive clones.

pH tolerance. If the enzymes of interest are known to be active
only in e.g. an acidic environment, direct screening will only
be possible on corresponding plates. pH tolerance is of course
limited by the tolerance of the host organism.

Inducibility. A tightly regulated yeast promoter makes it
possible to separate the growth stage from the expression stage,
thereby enabling expression of products which are known to
inhibit cell growth.

The Yarrowia lipolytica XPR2 yeast promoter of the prior
art:

The XPR2 gene from Y. lipolytica encodes an inducible alkaline
extracellular protease (AEP) which is the major protein secreted
by this yeast (Davidow et al, 1987b). Induction of AEP occurs
at pH above 6.0 on media lacking preferred carbon and nitrogen
sources and full induction requires high levels of peptones in
the culture medium (Ogrydziak et al, 1977; Ogrydziak and Scharf,
1982). The regulation of the XPR2 gene is very complex and not
yet fully understood.

The fact that the XPR2 yeast promoter is only active at
pH above 6.0 on media lacking preferred carbon and nitrogen
sources and full induction requires high levels of peptone in
the culture medium is highly disadvantageous for the use of such
a yeast promoter in expression cloning in yeast. The demand for pH
above 6.0 in the medium makes it impossible to screen directly
for secreted enzymes that are active only in an acidic
environment. The presence of peptone in the medium complicates
product recovery and purification. Finally the presence of
peptone hinders the direct screening for transformants based on
LEU2 selection.

In contrast to the known XPR2 yeast promoter of Y.
lipolytica, the yeast promoters of the present invention are
active preferably in the pH range from 4 to 11 (see above), and
do not require peptone in the medium or any other ingredients
which seriously complicate product recovery and purification.
Therefore the yeast promoters of the invention are highly suitable
for use in expression cloning in yeast and recombinant
expression in general.

A comparative study of the XPR2 yeast promoter and the
yeast promoters of the invention is provided in Example 9. In
example 9 it is shown that the yeast promoters of the invention
are improved compared to the XPR2 yeast promoter when tested for
yeast promoter activity on growth plates, which can be
considered as an imitation of a screening event.

Microbial Sources

In a preferred embodiment, a yeast promoter of the
invention is derived from a Yarrowia lipolytica yeast strain.

It is at present contemplated that a yeast promoter of
the invention, i.e. an analogous yeast promoter, may be obtained
from other micro-organisms. For instance, the yeast promoter may
be derived from other yeast strains, such as a strain of
Saccharomyces cerevisiae.

Expression vector 

In another aspect, the invention provides a recombinant 
expression vector comprising a yeast promoter of the invention. 

The expression vector of the invention may be any
expression vector that is conveniently subjected to recombinant
DNA procedures, and the choice of vector will often depend on
the host cell into which it is to be introduced.

The expression vector may e.g. be used for achieving
expression cloning in yeast or for the production of a
heterologous polypeptide of interest. In the latter case, the
expression vector comprises i) a yeast promoter of the invention
and ii) a DNA sequence coding for a polypeptide of interest.

In an expression vector for use in expression cloning in
yeast, cDNAs to be screened according to the expression cloning
technique described in WO 93/11249 should be operably connected
to a yeast promoter of the present invention and a terminator
sequence (see WO 93/11249).

Further the expression vector may be used to enable
recombinant production of a heterologous and/or homologous
protein of interest, preferably an enzyme of interest.

The procedures used to ligate the DNA sequences coding
for the cDNA library, a DNA sequence coding for a protein of
interest, the yeast promoter and the terminator, respectively,
and to insert them into suitable vectors are well known to
persons skilled in the art (cf., for instance, Sambrook et al.,
(1989), Molecular Cloning. A Laboratory Manual, Cold Spring
Harbor, NY).

Yeast host cells

In yet another aspect the invention provides a host cell
comprising the recombinant expression vector of the invention.

Preferably, the host cell of the invention is a
eucaryotic cell, in particular a yeast cell.

Examples of such yeast host cells include, but are not
limited to, a strain of Saccharomyces, in particular
Saccharomyces cerevisiae, Saccharomyces kluyveri or Saccharomyces
uvarum, a strain of Schizosaccharomyces sp., such as
Schizosaccharomyces pombe, a strain of Hansenula sp., Pichia sp.,
Yarrowia sp., such as Yarrowia lipolytica, or Kluyveromyces sp.,
such as Kluyveromyces lactis.

Especially a strain of Yarrowia lipolytica is a suitable
host for the present invention.

Process of producing a polypeptide 

In a still further aspect, the present invention
provides a process of producing a polypeptide of interest, wherein
a suitable host cell, which has been transformed with an
expression vector comprising i) the yeast promoter of the
invention and ii) a DNA sequence coding for a polypeptide of
interest, is cultured under conditions permitting the production
of the polypeptide, and the resulting polypeptide is recovered
from the culture.

The polypeptide may be a protein, e.g. an enzyme such 
as a protease, amylase or lipase. 

The medium used to culture the transformed host cells
may be any conventional medium suitable for growing the host
cells in question. The expressed polypeptide of interest may
conveniently be secreted into the culture medium and may be
recovered therefrom by well-known procedures including separating
the cells from the medium by centrifugation or filtration,
precipitating proteinaceous components of the medium by means of
a salt such as ammonium sulphate, followed by chromatographic
procedures such as ion exchange chromatography, affinity
chromatography, or the like.

The invention is described in further detail in the
following examples which are not in any way intended to limit
the scope of the invention as claimed.

MATERIALS AND METHODS
General methods 

If not further specified, all the experimental
techniques referred to below were performed by standard
techniques within the field of recombinant DNA technology, cf.
Sambrook et al., 1989.

The experimental techniques include, among others, construction
of plasmids, ligation, transformation, sequencing and
hybridization.

Restriction endonucleases were purchased from New
England Biolabs and Boehringer Mannheim and used as recommended
by the manufacturers. T4 DNA ligase was purchased from New
England Biolabs and used as recommended by the manufacturer.

Strains, plasmids, and transformation procedures. 

Bacterial strains used were Escherichia coli MC1061
(Wertman, K.F. et al 1986); SJ2, a derivative of C600 (Raleigh,
E.A. et al 1988); and DH10B (Gibco BRL). The Yarrowia
lipolytica strain used was POld (W29 derivative) ura 3-302, leu
2-270, xpr 2-322 a (gift from Claude Gaillardin, Centre de
Biotechnologie Agro-Industrielle, France).

Plasmids used are described in table I and figure 2.
Those carrying deletions in the cloned yeast promoters are
described in figure 10 and 13. All deletions were introduced
into pY5TA-43kD/X1 or pY5RB-43kD/X1.

Y. lipolytica was transformed by electrotransformation.
SJ2 and DH10B were transformed by electrotransformation and
MC1061 by ordinary transformation.

Enzymes used as reporter genes: 

43kD Cellulase II from Humicola insolens described in WO 
91/17243 . 

Xylanase I from Humicola insolens described in WO 92/17573. 


TABLE I. Plasmids used (except those carrying yeast promoter
deletions).

Plasmid      Use/relevant features                            Source

pSJ1678      Bacillus/E. coli shuttle vector used for
             cloning of Sau3A I digested POld genomic
             DNA (figure 2).

pUC19        Used for sequence determination of positives     Yanisch-Perron,
             from POld genomic library originally cloned      C. et al 1985
             in pSJ1678.

pYES 2.0     Used for cloning of POld cDNA libraries as       Invitrogen USA
             BstXI/NotI fragments.

pY343kD      Y. lipolytica expression vector based on the
             XPR2 yeast promoter and the LEU2 gene as a
             selection marker. The 43kD cellulase II from
             Humicola insolens is used as a reporter gene.
             pY343kD is similar to pY3X1 (figure 2), except
             that in pY343kD cellulase II is used as
             reporter gene instead of Xylanase I.

pY3X1        Y. lipolytica expression vector based on the
             XPR2 yeast promoter and the LEU2 gene as a
             selection marker. The xylanase I from
             Humicola insolens is used as a reporter gene
             (see figure 2).

pY5TA43kD    Based on pY343kD. The XPR2 yeast promoter
             sequence has been removed as a ClaI/BamHI
             fragment and replaced by the translational
             elongation factor 1α yeast promoter sequence
             edition A cloned in this study (see figure
             10).

pY5TAX1      Based on pY3X1. The XPR2 yeast promoter
             sequence has been removed as a ClaI/BamHI
             fragment and replaced by the translational
             elongation factor 1α yeast promoter sequence
             edition A cloned in this study (see figure
             10).

pY5RB43kD    As pY5TA43kD, except that the ribosomal
             protein S7 yeast promoter sequence edition B
             cloned in this study is used as the yeast
             promoter (see figure 11).

pY5RBX1      As pY5TAX1, except that the ribosomal protein
             S7 yeast promoter sequence edition B cloned
             in this study is used as the yeast promoter
             (see figure 11).

pY543kDCV    Control vector based on pY343kD. The XPR2
             yeast promoter sequence has been removed as a
             ClaI/BamHI fragment and the vector religated
             after blunt ending by Mung Bean Nuclease
             treatment.

pY5X1CV      Control vector based on pY3X1. The XPR2 yeast
             promoter sequence has been removed as a
             ClaI/BamHI fragment and the vector religated
             after blunt ending by Mung Bean Nuclease
             treatment.



Further details of strains:

E. coli strains

For use in the vector construction work:
MC1061
F- araD139 Δ(ara-leu)7696 galE15 galK16 Δ(lac)X74 rpsL (StrR)
hsdR2 (rK- mK+) mcrA mcrB1

As host strain for Yarrowia lipolytica POld cDNA libraries:
DH10B (Gibco BRL)
F- mcrA Δ(mrr-hsdRMS-mcrBC) φ80dlacZΔM15 ΔlacX74 deoR recA1 endA1
araD139 Δ(ara, leu)7697 galU galK λ- rpsL nupG

As host strain for Yarrowia lipolytica POld genomic library:
SJ2: A C600 derivative (Raleigh et al, 1988).

For site-specific mutagenesis on pY1:
E. coli BMH71-18 mutS: thi supE Δ(lac-proAB) [mutS::Tn10] F'
[proAB+ lacIq lacZΔM15]

Yeast strains

Yarrowia lipolytica

PO1d: ura3-302 leu2-270 xpr2-322

E129: MatA lys11-23 ura3-302 leu2-270 xpr2-322
E150: MatB his-1 ura3-302 leu2-270 xpr2-322

ura3-302 is a disruption of URA3
leu2-270 is an internal deletion
xpr2-322 is a deletion removing transcriptional start, ATG, and
part of the pre-pro region.

Saccharomyces cerevisiae JG169:
W3124:

Mat a ura3-52 leu2-3,112 his3-Δ200 pep4-1137 Δprc1::HIS3
prb1::LEU2 cir+

Hansenula polymorpha A16:

Transformants are selected on the basis of a defective leucine
gene.

Schizosaccharomyces pombe 972: h ura4-294

Kluyveromyces lactis MW98-8C: Mat a uraA arg lys K+ pKD1°

Transformation of yeast cells: 

Electro-competent yeast cells

This S. cerevisiae protocol was used without modifications to
make electro-competent Yarrowia lipolytica PO1d cells.

1. Inoculate 500 ml YPD with an aliquot from an overnight
culture. Grow with vigorous shaking at 30 °C to an OD600 of 1.3
- 1.5 (approximately 1 x 10^8 cells/ml).
2. Divide the culture into two centrifuge bottles and spin at
5000 rpm for 5' at 4 °C in a Beckman centrifuge. Discard the
supernatant.
3. Resuspend in a total of 500 ml ice-cold sterile water and
centrifuge as above. Discard the supernatant.
4. Resuspend in a total of 250 ml. Pool the two 125 ml aliquots
into a single bottle and centrifuge as above. Discard the
supernatant.
5. Resuspend in 20 ml ice-cold 1 M sorbitol. Transfer to a
chilled 30 ml centrifuge tube. Centrifuge as above and discard
the supernatant.
6. Resuspend by adding 0.5 ml ice-cold 1 M sorbitol. Store on
ice.

The cells can be stored at -80 °C for several months.
It is very important to keep the culture and all solutions cold
during the treatment of the cells.

Culture media and growth conditions. 

Prior to the construction of cDNA libraries, initial
growth experiments with PO1d were performed in YP medium with
addition of 2 % of the various carbohydrate sources tested.
Cells were grown at 30 °C. MC1061 and DH10B transformants were
grown in LB medium + 100 µg/ml ampicillin. SJ2, transformed with
pSJ1678, in which PO1d genomic DNA was cloned as Sau3A I
fragments, was grown in LB + 10 µg/ml chloramphenicol. For
Northern blot analysis and construction of cDNA libraries, PO1d
cultures were grown in YP + 2 % glucose or 2 % glycerol (library 1
and 2 respectively) at 30 °C and cells were harvested late in the
logarithmic phase at an optical density at 600 nm (OD600) of 5.5.
For construction of a genomic library and Southern blot analysis,
PO1d cultures were grown in YP-glucose. For cellulase and
xylanase assays, positives were precultured in SC-leu medium and
the respective inducing growth media were inoculated to an OD600
of 0.1. Transformants containing XPR2 yeast promoter based vectors
were grown in XPR2 optimal medium. SC-leu medium was used as
inducing medium for transformants in which the novel yeast
promoter sequences were introduced. Transformants were grown in
100 ml media in 500 ml bottles at 30 °C, 250 rpm. Samples were
taken 3 times during the logarithmic phase (SC-leu cultures
OD600 < 0.5, XPR2 optimal medium cultures OD600 < 10) and 3 times
during the stationary phase.

Extraction of total RNA is performed with guanidinium
thiocyanate followed by ultracentrifugation through a 5.7 M CsCl
cushion, and isolation of poly(A)+ RNA is carried out by
oligo(dT)-cellulose affinity chromatography using the procedures
described in WO 94/14953.

cDNA synthesis: Double-stranded cDNA is synthesized from
5 µg poly(A)+ RNA by the RNase H method (Gubler and Hoffman
(1983) Gene 25:263-269, Sambrook et al. (1989) Molecular
cloning: A laboratory manual, Cold Spring Harbor lab., Cold
Spring Harbor, NY) using the hair-pin modification developed by
F. S. Hagen (pers. comm.). The poly(A)+ RNA (5 µg in 5 µl of
DEPC-treated water) is heated at 70 °C for 8 min. in a
pre-siliconized, RNase-free Eppendorf tube, quenched on ice and
combined in a final volume of 50 µl with reverse transcriptase
buffer (50 mM Tris-Cl, pH 8.3, 75 mM KCl, 3 mM MgCl2, 10 mM DTT,
Bethesda Research Laboratories) containing 1 mM of dATP, dGTP
and dTTP and 0.5 mM 5-methyl-dCTP (Pharmacia), 40 units human
placental ribonuclease inhibitor (RNasin, Promega), 1.45 µg of
oligo(dT)18-Not I primer (Pharmacia) and 1000 units SuperScript
II RNase H- reverse transcriptase (Bethesda Research
Laboratories). First-strand cDNA is synthesized by incubating
the reaction mixture at 45 °C for 1 hour. After synthesis, the
mRNA:cDNA hybrid mixture is gel filtered through a MicroSpin
S-400 HR (Pharmacia) spin column according to the manufacturer's
instructions.

After the gel filtration, the hybrids are diluted in 250
µl second strand buffer (20 mM Tris-Cl, pH 7.4, 90 mM KCl, 4.6
mM MgCl2, 10 mM (NH4)2SO4, 0.16 mM β-NAD+) containing 200 µM of
each dNTP, 60 units E. coli DNA polymerase I (Pharmacia), 5.25
units RNase H (Promega) and 15 units E. coli DNA ligase
(Boehringer Mannheim). Second strand cDNA synthesis is performed
by incubating the reaction tube at 16 °C for 2 hours and an
additional 15 min. at 25 °C. The reaction is stopped by addition
of EDTA to a final concentration of 20 mM followed by phenol and
chloroform extractions.

Mung bean nuclease treatment: The double-stranded cDNA
is precipitated at -20 °C for 12 hours by addition of 2 vols 96 %
EtOH, 0.2 vol 10 M NH4Ac, recovered by centrifugation, washed in
70 % EtOH, dried and resuspended in 30 µl Mung bean nuclease
buffer (30 mM NaAc, pH 4.6, 300 mM NaCl, 1 mM ZnSO4, 0.35 mM
DTT, 2 % glycerol) containing 25 units Mung bean nuclease
(Pharmacia). The single-stranded hair-pin DNA is clipped by
incubating the reaction at 30 °C for 30 min., followed by addition
of 70 µl 10 mM Tris-Cl, pH 7.5, 1 mM EDTA, phenol extraction and
precipitation with 2 vols of 96 % EtOH and 0.1 vol 3 M NaAc, pH
5.2 on ice for 30 min.

Blunt-ending with T4 DNA polymerase: The double-stranded
cDNAs are recovered by centrifugation and blunt-ended in 30 µl
T4 DNA polymerase buffer (20 mM Tris-acetate, pH 7.9, 10 mM
MgAc, 50 mM KAc, 1 mM DTT) containing 0.5 mM of each dNTP and 5
units T4 DNA polymerase (New England Biolabs) by incubating the
reaction mixture at 16 °C for 1 hour. The reaction is stopped by
addition of EDTA to a final concentration of 20 mM, followed by
phenol and chloroform extractions, and precipitation for 12
hours at -20 °C by adding 2 vols 96 % EtOH and 0.1 vol 3 M NaAc pH
5.2.

Adaptor ligation, Not I digestion and size selection:

After the fill-in reaction the cDNAs are recovered by
centrifugation, washed in 70 % EtOH and dried. The cDNA pellet is
resuspended in 25 µl ligation buffer (30 mM Tris-Cl, pH 7.8, 10
mM MgCl2, 10 mM DTT, 0.5 mM ATP) containing 2.5 µg
non-palindromic BstXI adaptors (Invitrogen) and 30 units T4 ligase
(Promega) and incubated at 16 °C for 12 hours. The reaction is
stopped by heating at 65 °C for 20 min. and then cooling on ice
for 5 min. The adapted cDNA is digested with Not I restriction
enzyme by addition of 20 µl water, 5 µl 10x Not I restriction
enzyme buffer (New England Biolabs) and 50 units Not I (New
England Biolabs), followed by incubation for 2.5 hours at 37 °C.
The reaction is stopped by heating at 65 °C for 10 min. The cDNAs
are size-fractionated by gel electrophoresis on a 0.8 % SeaPlaque
GTG low melting temperature agarose gel (FMC) in 1x TBE to
separate unligated adaptors and small cDNAs. The cDNA is
size-selected with a cut-off at 0.7 kb and rescued from the gel by
use of β-agarase (New England Biolabs) according to the
manufacturer's instructions and precipitated for 12 hours at -20 °C
by adding 2 vols 96 % EtOH and 0.1 vol 3 M NaAc pH 5.2.

Construction of directional cDNA libraries.

Total RNA was extracted and poly(A)+ RNA isolated. From
500 ml cultures (≈ 2.75 x 10^10 cells) a yield of 1.9 mg and 2.9
mg total RNA was obtained (library 1 and 2 respectively).
Isolation of poly(A)+ RNA yielded 1.1 % and 2.2 % (library 1 and
2 respectively). Double-stranded cDNA was synthesised from 5 µg
poly(A)+ RNA as described. The method includes introduction of a
3' NotI site by the oligo(dT)-NotI anchor primer. Size estimation
of the double-stranded cDNA on 1 % agarose showed a distribution
of the product between 0.3 Kb and 10 Kb. Removal of the
single-stranded hairpin DNA by Mung bean nuclease treatment and
blunt ending with T4 DNA polymerase was followed by ligation of
non-palindromic BstXI adaptors. After NotI digestion a 5' BstXI
3' NotI product was obtained. The cDNA was size fractionated on a
0.8 % low melt agarose gel and fragments > 0.8 Kb were purified.
Test ligations with different amounts of BstXI/NotI digested
pYES 2.0 vectors, followed by electrotransformation of DH10B,
did not result in saturation of the vector in any case. Thus
the sizes of the libraries were estimated to be at least 2 x 10^6
and 3 x 10^5 (library 1 and 2 respectively).
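The yields quoted above can be turned into absolute amounts of poly(A)+ RNA with a short calculation. The sketch below assumes that the percentages are mass fractions of the total RNA; the function name is hypothetical and used only for illustration.

```python
def poly_a_yield_ug(total_rna_mg, poly_a_percent):
    """Return the poly(A)+ RNA amount in micrograms.

    Assumes poly_a_percent is the mass fraction (in %) of total RNA
    recovered as poly(A)+ RNA.
    """
    return total_rna_mg * 1000.0 * poly_a_percent / 100.0


# Values for library 1 and library 2 as stated in the text
library_1_ug = poly_a_yield_ug(1.9, 1.1)
library_2_ug = poly_a_yield_ug(2.9, 2.2)

print(f"Library 1: {library_1_ug:.1f} ug poly(A)+ RNA")
print(f"Library 2: {library_2_ug:.1f} ug poly(A)+ RNA")
```

Under this assumption both preparations give roughly 21 and 64 micrograms of poly(A)+ RNA, respectively.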


Sequence determination. 

All sequence determinations were performed on an ABI 373 or
377 DNA Sequencer and analysed by use of Sequencher 2.1 or 3.0
(Gene Codes Corporation, USA). Initial sequence determination of
cDNA clones was performed only on one strand at the 5' end by
use of a single primer. Selected cDNA clones were sequenced on
both strands except at the sequence just upstream of the poly(A)
tail, where a single strand was sequenced twice with different
primers. Sequence determination of positive clones from the
genomic libraries was not possible when present in the pSJ1678
background, so these inserts were cloned in pUC19 prior to
sequence determination of both strands. The genomic insert that
responded to the L1.41 based probe was cloned in pUC19 as a SalI
fragment, and the insert that responded to the L2.17 based probe
was cloned as a PstI fragment, because the remaining practicable
cloning sites were present internally in these inserts.

Southern and Northern blots.

Southern blot analysis was carried out at standard
conditions. For each analysis 4 x 10 µg PO1d genomic DNA was
digested to completion (20 units enzyme, 3 hours incubation +
an additional 10 units enzyme and 2 hours incubation) and
fractionated on 1 % SeaKem GTG agarose (FMC Bioproducts). To
examine for DNase contamination, 5 µg genomic DNA was incubated
for 5 hours in one of the restriction buffers used for
digestion. Polymerase chain reaction (PCR) copies of the
respective cDNAs were used as probes. Radioactive labelling of
the DNA by random priming was carried out.

Northern blot analysis was carried out at standard
conditions. 2 x 2.5 µg poly(A)+ RNA from library 1 and 2 was
fractionated on 1 % SeaKem GTG agarose gels. One gel was used
for ethidium bromide staining (60' in 0.1 M NH4Ac, 0.5 µg/ml
EtBr); one gel was used for blotting. The same probes as in the
Southern blot analysis were used. The membrane was exposed for
45' prior to development.

Preparation of total DNA from yeast:

The optimal method for preparation of total DNA from
yeast depends on the yeast strain. In the case of Yarrowia
lipolytica a modified S. cerevisiae protocol has been used.

1. Inoculate 20 ml YPD in a 100 ml shake flask and ferment at
30 °C O.N.
2. Spin 5' at 5000 rpm.
3. Remove supernatant and resuspend cells in 400 µl 0.9 M
sorbitol, 0.1 M EDTA pH 7.5, 14 mM β-mercaptoethanol.
4. Add 100 µl Novozym (2 mg/ml) and incubate 30' at 37 °C. (At
this point one should be able to monitor spheroplast
formation.)
5. Spin 30'' in a microfuge.
6. Remove supernatant and resuspend pellet in 400 µl TE + 5 µl
10X RNase A + T (boil 10' before use).
7. Add 90 µl of a freshly made mixture of 1.5 ml 0.5 M EDTA
pH 8.0 (final 280 mM), 0.6 ml 2 M Tris (final 444 mM) and 0.6 ml
10 % SDS (final 2.2 %).
8. Incubate 30' at 65 °C.
9. Add 80 µl 5 M KAc. Vortex and leave on ice 30'.
10. Spin 15' at 20,000 g.
11. Transfer supernatant to a new tube. Add 1 vol.
phenol/chloroform, vortex and leave at 65 °C 10'. Spin 5' at
20,000 g.
12. Transfer upper phase to a new tube and add 1 vol. chloroform.
Vortex and spin. Move upper phase to a new tube. Add 3 x vol.
96 % EtOH. Mix carefully.
13. Spin 5' at 20,000 g.
14. Wash pellet in 70 % EtOH and dry pellet at R.T.
15. Resuspend pellet gently in 250 µl of a suitable buffer.
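The final concentrations quoted in step 7 follow from the volumes and stock concentrations of the three components. The sketch below recomputes them, assuming simple additive volumes; the function and variable names are hypothetical and only serve as an illustration.

```python
def final_concentrations(components):
    """Compute final concentrations after mixing.

    components maps a label to (volume_ml, stock_concentration).
    Assumes volumes are additive; the unit of each result is the
    unit of the corresponding stock concentration.
    """
    total_volume = sum(volume for volume, _ in components.values())
    return {
        label: volume * stock / total_volume
        for label, (volume, stock) in components.items()
    }


# Mixture from step 7: 1.5 ml 0.5 M EDTA, 0.6 ml 2 M Tris, 0.6 ml 10 % SDS
lysis_mix = {
    "EDTA (mM)": (1.5, 500.0),
    "Tris (mM)": (0.6, 2000.0),
    "SDS (%)": (0.6, 10.0),
}

final = final_concentrations(lysis_mix)
for label, value in final.items():
    print(f"{label}: {value:.1f}")
```

The computed values (about 278 mM EDTA, 444 mM Tris and 2.2 % SDS) agree with the rounded figures given in the protocol.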

Construction of a genomic library.

Total PO1d DNA was prepared from a YPD culture. Partially
Sau3A I digested DNA was fractionated on a 1 % low melt agarose
gel and 1 - 2 Kb fragments were purified by use of β-agarase.
DNA fragments were ligated with pSJ1678, from which the
kanamycin resistance gene was deleted as a BamHI fragment, and
introduced into SJ2 by electrotransformation. A library of >
45,000 clones was established with a vector background level <
0.5 %. According to the formula by Clark and Carbon (1976) this
library size corresponds to a 95 % probability that an arbitrary
sequence is represented: N = ln(1 - P) / ln(1 - f), where P equals
the probability that a given unique DNA sequence is present in a
collection of N transformant colonies and f is the fraction of
the total genome used as a source for each fragment. The
calculation relies on the assumptions that the Sau3A I restriction
sites are randomly distributed throughout the genome, that all
fragments ligate equally well within the size distribution of
the fragments (average size = 1.5 Kb), and that the PO1d genome
has a size of 20 Mb.
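The Clark and Carbon relation can be evaluated directly with the figures stated above (average insert 1.5 Kb, genome size 20 Mb). The following is a minimal sketch; the function names are hypothetical, and f is taken as average insert size divided by genome size, as in the stated assumptions.

```python
import math


def clones_needed(p, f):
    """Clark and Carbon: N = ln(1 - P) / ln(1 - f)."""
    return math.log(1.0 - p) / math.log(1.0 - f)


def probability_represented(n, f):
    """Inverse relation: P = 1 - (1 - f)^N."""
    return 1.0 - (1.0 - f) ** n


# f = average insert size / genome size, both in base pairs
f = 1.5e3 / 20e6

n_95 = clones_needed(0.95, f)
p_library = probability_represented(45000, f)

print(f"Clones needed for P = 0.95: {n_95:.0f}")
print(f"P for a library of 45,000 clones: {p_library:.3f}")
```

With these inputs about 40,000 clones are needed for a 95 % probability, so a library of more than 45,000 clones slightly exceeds that requirement.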

Plasmid DNA was isolated from 20 colonies and cut with
HindIII. Analysis on 1 % agarose gel showed a random distribution
of inserts between 1 Kb and 2 Kb (data not shown).

Colony hybridization. 

Colony hybridization was performed at standard conditions.
5 replicas of the genomic library were made (3 for hybridization,
1 for plasmid DNA preparation and 1 backup). 400 bp
PCR copies of the 5' end of the respective cDNA clones were used
as probes.

PCR and cloning of yeast promoter sequences. 

PCR was performed at standard conditions, except for the
primer annealing temperature, which was raised to 60 °C. All
products were purified prior to further use with the QIAquick PCR
Purification kit (QIAGEN) as recommended by the manufacturer.

PCR copies of the identified yeast promoter sequences
were cloned in the expression vectors as 5' ClaI 3' BamHI
fragments. These cloning sites were introduced by the PCR
primers. The number of bases flanking the restriction sequences on
the primers was selected as recommended by New England Biolabs®.
In the ribosomal protein S7 yeast promoter sequence, a BamHI
site was present close to the 5' end. Therefore, only the
sequence located downstream of this site was introduced in the
expression vectors. Further, an internal ClaI site was present at
position -269 from the putative translational ATG start codon.
PCR editions affected by this restriction site were digested
with BamHI, purified with the QIAquick PCR Purification kit
(QIAGEN), and then partially digested with ClaI. Relevant
fragments were purified from 1.5 % low melt agarose gel prior to
ligation and transformation of MC1061. Despite repeated efforts,
successful cloning was only obtained with some of the ribosomal
protein S7 yeast promoter editions. Due to time constraints,
this cloning problem was not examined any further. The different
PCR copies of the elongation factor 1α yeast promoter sequence,
and the edition of the ribosomal protein S7 yeast promoter
unaffected by the ClaI site, were ClaI/BamHI digested, purified
with the QIAquick PCR Purification kit (QIAGEN), and introduced
into MC1061. Successful cloning was verified by ClaI/BamHI digest
of Qiagen DNA preparations and analysis on 2 % agarose gel. All
the cloned PCR copies of the yeast promoters were sequenced and
compared with the original genomic sequences in order to examine
for misincorporations. No errors were detected.

Control vectors 

Control vectors for the enzyme activity experiments were
constructed on the basis of pY343kD and pY3X1. The XPR2 yeast
promoter sequences were removed as ClaI/BamHI fragments, vectors
were blunt ended by Mung bean nuclease treatment, religated and
amplified in MC1061. Successful nuclease treatment was verified
by restriction analysis and confirmed by sequence determination.

Enzymatic activity assays: 

Cellulase I, Cellulase II, and Xylanase I:

Activity determination of Cellulase I, Cellulase II, and
Xylanase I was made in liquid assays by the use of AZCL (AZurin
dyed Cross Linked) substrates (MEGAZYME Australia).

Mix in an Eppendorf tube:

• 100 - 145 µl optimal buffer

• 50 µl supernatant (in the case of supernatant samples S1 - S3
from xylanase I cultures only 5 µl), sample blank, standard or
standard blank.

• 100 µl 0.4 % AZCL substrate in milliQ water.

All samples are placed on ice during the treatment.

Incubate in a thermomixer at 40 °C, 1200 rpm, for 15 - 45
minutes (Standard OD620 > 0.2).

Spin 5' at 20,000 g and measure OD620 of the substrate-grain
free supernatants. All samples are measured in duplicate.
Sample and standard values are corrected for absorbance of the
relevant blank. The activity value is calculated as Sample
activity / Standard activity.
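The blank correction and ratio described above can be sketched as follows. This is an illustration only: the function name is hypothetical, the OD620 readings are invented example values rather than measured data, and duplicates are assumed to be averaged before blank subtraction.

```python
from statistics import mean


def relative_activity(sample, sample_blank, standard, standard_blank):
    """Blank-corrected sample OD620 divided by blank-corrected standard OD620.

    Each argument is a sequence of duplicate OD620 readings, which are
    averaged before the blank is subtracted (an assumption).
    """
    sample_corrected = mean(sample) - mean(sample_blank)
    standard_corrected = mean(standard) - mean(standard_blank)
    if standard_corrected <= 0:
        raise ValueError("standard must exceed its blank")
    return sample_corrected / standard_corrected


# Hypothetical duplicate readings, for illustration only
activity = relative_activity(
    sample=(0.52, 0.50),
    sample_blank=(0.05, 0.07),
    standard=(0.40, 0.42),
    standard_blank=(0.04, 0.06),
)
print(f"Relative activity: {activity:.2f}")
```

A value above 1 would mean that the sample shows more activity than the diluted standard under the same assay conditions.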



Enzyme          AZCL substrate    Optimal buffer               Standard (purified native enzyme)

Cellulase I     HE-Cellulose      0.1 M citrate/phos. pH 5.5   1.72 µg/µl diluted 1200 times
Cellulase II    HE-Cellulose      0.1 M Tris pH 7.5            1.5 µg/µl diluted 200 times
Galactanase I   Arabinogalactan   0.1 M citrate/phos. pH 4.5   2 µg/µl diluted 6000 times
Xylanase I      Birch-xylan       0.1 M Tris pH 7.0            1.47 µg/µl diluted 30,000 times



Polygalacturonase I (PG I) and Lipase activity:

The enzyme activity of supernatants from both PG I and
lipase cultures (chapter 5) was measured on substrate-containing
plates. 20 µl supernatant from each sample was loaded in wells
and the area of the clearing zone was related to the clearing
zones of a titration of a known amount of the respective native
enzymes.

PG I: 1.0 g agarose is added to 100 ml 0.1 M
citrate/phosphate buffer pH 4.5. The suspension is heated to the
boiling point and 1 g Obipectin (DE 35 %, NN Switzerland)
prewetted in 96 % EtOH is added prior to pouring 25 ml
aliquots on plates. 20 µl supernatant or standard is added in
wells and incubated 24 hours at 30 °C. 1 % MTAB (mixed
alkyltrimethylammonium bromide, Sigma®) is poured over the plates,
which are incubated at RT until the clearing zones are detectable.

Lipase: 2.0 g agarose is added to 10 ml 1 M Tris pH 9, 5 ml
2 M CaCl2 and 85 ml H2O. The suspension is heated to the boiling
point and a mixture of 0.5 ml olive oil and 1 ml Triton X-100 is
added prior to pouring 25 ml aliquots on plates. 20 µl
supernatant or standard is added in wells and incubated 24 hours
at 30 °C.

Western blotting 

SDS-PAGE electrophoresis, Western blotting and immunostaining.
25 µl supernatant from the respective maximum activity per
volume samples was loaded in each lane.

Hybridization 

Suitable hybridization conditions for determining hybridization
between a nucleotide probe and an "analogous" DNA sequence of
the invention may be defined as described below. The
oligonucleotide probe to be used is the DNA sequence from position
-241 to -41 in SEQ ID NO 1 or from -163 to -3 in SEQ ID NO 2. The
oligonucleotide probe used herein is preferably a
double-stranded DNA probe.

Hybridization conditions:

Suitable conditions for determining hybridization between a
nucleotide probe and a homologous DNA or RNA sequence involve
presoaking of the filter containing the DNA fragments or RNA to
hybridize in 5 x SSC (standard saline citrate) for 10 min, and
prehybridization of the filter in a solution of 5 x SSC
(Sambrook et al. 1989), 5 x Denhardt's solution (Sambrook et al.
1989), 0.5 % SDS and 100 µg/ml of denatured sonicated salmon
sperm DNA (Sambrook et al. 1989), followed by hybridization in
the same solution containing a concentration of 10 ng/ml of a
random-primed (Feinberg, A. P. and Vogelstein, B. (1983) Anal.
Biochem. 132:6-13), 32P-dCTP-labeled (specific activity > 1 x
10^9 cpm/µg) probe for 12 hours at ca. 45 °C. The filter is then
washed two times for 30 minutes in 2 x SSC, 0.5 % SDS at
preferably not higher than 55 °C, more preferably not higher than
60 °C, more preferably not higher than 65 °C, even more preferably
not higher than 70 °C, especially not higher than 75 °C.

Molecules to which the oligonucleotide probe hybridizes
under these conditions are detected using X-ray film.

Media:

E. coli media:

For selective growth of transformants:

LB medium (Luria-Bertani): 1 % Bacto tryptone, 0.5 % Bacto yeast
extract, 0.5 % NaCl + relevant antibiotics as described. Growth
plates are made with 2 % Bacto agar.

For growth of electro-transformed cells for 1 hour
prior to plating on selective medium:

SOC medium: 2 % Bacto tryptone, 0.5 % Bacto yeast extract, 10 mM
NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, 20 mM glucose.

Yeast media:

Selective medium for Yarrowia lipolytica transformants:
Synthetic Complete medium - leucine (SC-leu)

1 liter medium:

100 ml 10X Basal salt
100 ml 10X glucose (20 %)
100 ml 10X amino acids - leucine + vitamins
Add H2O to 1000 ml. Sterile filter.

Growth plates are made with 1.5 % agarose. Autoclave the agarose
in H2O, then add the remainder.

Amino acid stock solutions:
(SIGMA Cell Culture Reagent®)

Constituent                    Final mg/l

Adenine sulphate               20
Uracil                         20
Tryptophan                     20
Histidine                      20
Arginine                       20
Methionine                     20
Tyrosine                       30
Isoleucine                     30
Lysine                         30
Phenylalanine                  50
Glutamic acid                  100
Aspartic acid                  100
Valine                         150
Threonine                      200
Serine                         400

Biotin (vitamin H)             0.05
Thiamine HCl (vitamin B1)      5
myo-Inositol                   47
Pyridoxine HCl (vitamin B6)    1.2
Pantothenic acid               23


10X Basal salt:
1 liter:

66.8 g Yeast Nitrogen Base w/o amino acids (Difco), 100 g
succinic acid, 60 g NaOH. Add H2O to 1 liter and sterile
filter.

Inducing medium for XPR2 based Yarrowia lipolytica
transformants:

XPR2 optimal medium (a modification of the YPDm medium (Nicaud,
J.M. et al. 1989)):

0.1 % glucose
0.2 % (w/v) Yeast Extract
10 % (w/v) Proteose Peptone (DIFCO)

Bring Yeast Extract and Proteose Peptone to solution in 50 mM
NaHPO4 pH 6.8. Autoclave and add the glucose. For growth plates
add 1.5 % agar before autoclaving.

YP medium (per l): 20 g Bacto peptone, 10 g Bacto yeast extract.

EXAMPLES
EXAMPLE 1:

Construction of cDNA libraries.

A cDNA library can be considered as an image of the
transcriptional activity in the cell at the growth conditions present.
The aim was to identify strong yeast promoters that were active at
conditions suitable for use in expression cloning. Therefore the
ability of strain PO1d to assimilate various carbohydrate sources
was examined prior to the construction of cDNA libraries (figure
3). Assimilation of carbon compounds in terms of + or - has been
examined for some of the first Y. lipolytica strains isolated
(reviewed by Lodder, J. 1970) and a slight variation among the
strains was observed. In the present growth experiment
carbohydrates of both categories were tested.

The growth experiment (figure 3) clearly demonstrated that
strain PO1d is capable of utilizing glucose and glycerol as
carbohydrate sources. The indication of weak assimilation of maltose
is in agreement with the observations by Lodder. In the attempt to
identify not only strong but also inducible yeast promoters, it
was decided to construct cDNA libraries from both YP-glucose and
YP-glycerol cultures. The idea was that if the presence of glucose
or glycerol caused distinct patterns of induction or repression of
yeast promoters (e.g. a glucose repression effect) this would
appear in a comparison of sequences from the two libraries.

EXAMPLE 2:
Analysis of cDNA libraries.

An initial sequence determination was performed on 100 clones from
each cDNA library in which 300 - 600 nucleotides of the 5' end of
the inserts were determined. The sequence data from each library
were aligned internally and to each other. In the following the
cDNA library from the YP-glucose culture is referred to as L1 and
the library from the YP-glycerol culture as L2.

Seven different sequences from L1 were represented twice and
two different sequences were represented three times. It turned
out that one of the pairs came from identical clones, which is
possible due to the growth of the transformed cells for 1 hour in
liquid medium prior to transfer to growth plates. Fourteen different
sequences from L2 were represented twice, of which two pairs came
from identical clones. Alignment of L1 and L2 showed that several
sequences from one library were also represented in the other.
Four sequences of the 200 clones examined were chosen for further
examination (figure 4): one representing the sequence observed
most frequently (A), one representing the sequence most frequently
observed only in L1 (B), one representing the second most
frequently observed sequence (C), and one representing a sequence
observed twice in L2 and not in L1 (D).

EXAMPLE 3:

Comparative measurement of transcription frequency.

The detection of a sequence in two or three copies in only one of
the cDNA libraries could indicate that different yeast promoter
activity was present in the YP-glucose and YP-glycerol media. To
test this, a Northern blot analysis was performed. PCR copies of
the selected cDNA sequences were hybridized to poly(A)+ RNA from
the YP-glucose and YP-glycerol cultures respectively. If the
frequency of the different cDNA sequences reflects the quantity of
the corresponding transcripts, unequal intensity of signals could
be expected when probes based on the L1.45 or L2.17 sequences were
used.

The intensity of the signals did not differ in any case,
regardless of the origin of the poly(A)+ RNA; no difference was
observed at shorter exposure either. These data indicate that no
significant repression of transcription has taken place concerning
the sequences examined, either in the presence of glucose or
glycerol in the medium.
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EXAMPLE 4: 

Copy number analysis.

A high frequency of a specific sequence reflects the presence of a
strong yeast promoter. This is of course quite a simplification. A
high frequency could also be caused by a high mRNA stability or a
high copy number of the gene. If all these copies were actively
transcribed, the yeast promoter strength implied by the sequence
frequency would be correspondingly lower. In order to examine the
copy number of the selected genes in the genome, a Southern blot
analysis was carried out.

It appears that the probes based on the L1.41 (A), L1.45 (B) or
L2.17 (D) sequences hybridize at one or two areas on the
membrane, depending on the enzymes used. Sequence determination of
the entire cloned cDNA of L1.41 and L2.17 (see later) showed that
in the case of L1.41, both a HindIII site and a KpnI site were
present in the structural gene, and in the case of L2.17 a KpnI site
was present. A PCR copy of the L1.45 sequence was digested with
the enzymes employed in the Southern analysis. Electrophoresis on
agarose gel showed that an internal KpnI site was present (data
not shown). This strongly indicates that the PO1d genome contains
only one copy of each of L1.41, L1.45 and L2.17.

The L2.7 based probe hybridizes at several distinct areas
of the membrane. Digestion of a PCR copy of the L2.7 sequence with
the employed enzymes did not reveal an internal presence of these
sites (data not shown). This shows that the L2.7 sequence is
present in several copies in the PO1d genome. For this reason no
attempt was made to identify and test the yeast promoter matching
the L2.7 sequence.


EXAMPLE 5: 

Identification of putative yeast promoter sequences by colony
hybridization.

To identify the yeast promoters matching the strongly expressed
transcripts, a PO1d genomic library was established. 1 - 2 Kb
Sau3A I digested PO1d fragments were cloned in BamHI digested
pSJ1678 (figure 2). 45,000 transformants were obtained,
corresponding to a 95 % probability that an arbitrary sequence is



WO 97/44470 



PCT/DK97/00232 



-35- 

represented. Prior to colony hybridization, PCR copies of L1.41,
L1.45 and L2.17 were digested with Sau3A I. Analysis on agarose
gel showed a significant reduction in fragment size (data not
shown), due to the presence of internal Sau3A I sites. In order to
increase the probability of identifying sequences located upstream
of the ATG start codon of the selected genes, 400 bp PCR copies of
the 5' end of the selected cDNA sequences were used as probes.

Colony hybridization resulted in one positive for each
probe used. Unfortunately, the positive corresponding to L1.45 was
lost during the process of isolating the positives. HindIII digest
showed that the clones responding to the L1.41 and L2.17 based
probes contained inserts of approximately 1000 and 1500 bp,
respectively. Sequence determination of the positive inserts was
not possible when present in the pSJ1678 background and hence the
inserts were recloned in pUC19 prior to sequencing.

EXAMPLE 6: 

Sequence determination of the L1.41 related genome and cDNA
sequence.

The 915 bp L1.41 related genomic DNA was sequenced on both strands
as illustrated in figure 5.

Alignment of the L1.41 related genomic DNA with the
corresponding cDNA revealed a 549 bp overlap between 3' genomic
DNA and 5' cDNA, corresponding to a 366 bp sequence in the genomic
DNA located upstream of the cDNA sequence (figure 6). The sequence
in figure 6 is the same as shown in SEQ ID NO 1.

The nucleotide sequence and the deduced amino acid
sequence of the L1.41 cDNA are presented in figure 7. The sequence
in figure 7 is the same as shown in SEQ ID NO 3.

The 1500 bp cDNA clone contains a 1380 bp open reading frame
initiated with an ATG codon 41 bp downstream of the 5' terminal
nucleotide, and terminated with a TAA stop codon 1421 bp
downstream of the 5' terminal nucleotide, thus predicting a
460-residue polypeptide of 50065 Da. The open reading frame is
preceded by a 40 bp 5'-noncoding region and followed by a 60 bp
3'-noncoding region and a poly(A) tail.
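The coordinates given for the cDNA clone can be cross-checked with simple arithmetic. The sketch below assumes 1-based positions and that the 1380 bp open reading frame excludes the stop codon (consistent with 460 codons); the function name and dictionary keys are hypothetical.

```python
def orf_summary(atg_position, orf_length_bp, utr3_length_bp):
    """Derive cDNA landmarks from the ORF coordinates.

    Assumes 1-based positions and an ORF length that excludes the
    stop codon.
    """
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    stop_codon_start = atg_position + orf_length_bp
    return {
        "residues": orf_length_bp // 3,
        "utr5_bp": atg_position - 1,
        "stop_codon_start": stop_codon_start,
        # last base before the poly(A) tail: stop codon (3 bp) + 3' UTR
        "last_base_before_poly_a": stop_codon_start + 2 + utr3_length_bp,
    }


summary = orf_summary(atg_position=41, orf_length_bp=1380, utr3_length_bp=60)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these assumptions the stated values are mutually consistent: 460 residues, a 40 bp 5'-noncoding region and a stop codon beginning at position 1421, leaving the remainder of the approximately 1500 bp clone for the poly(A) tail.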

An initial FastA search (Pearson and Lipman, 1988) on the
GenEMBL database showed significant similarity of the L1.41 cDNA
sequence to the translation elongation factor EF-1α gene (TEF) of
various sources, e.g. Arxula adeninivorans, Neurospora crassa and
Saccharomyces cerevisiae (see appendix II).

GAP alignments (Needleman and Wunsch, 1970) were made with
the complete L1.41 cDNA sequence aligned to the TEF gene sequences
of the yeasts A. adeninivorans and S. cerevisiae; 83.8 and 76.4
percent similarity was observed, respectively.
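
For readers unfamiliar with global alignment, the following Python sketch shows the Needleman-Wunsch dynamic-programming scheme on a toy pair of sequences. It is a minimal illustration, not the GAP program used here: the scoring values (match +1, mismatch -1, gap -1), the function names and the toy sequences are all assumptions, and the percentage it reports is simple identity over the aligned length.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


total, top, bottom = needleman_wunsch("ACGT", "AGT")
print(top)
print(bottom)
print(total, percent_identity(top, bottom))
```

Real alignment programs use substitution matrices and separate gap-opening and gap-extension penalties, so their similarity figures are not directly comparable with the output of this sketch.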

At the amino acid level a BLASTX search (Altschul et al,
1990) on the swissprot database showed as much as 91 percent
identity to the elongation factor 1α of A. gossypii. To the
corresponding genes of C. albicans, A. adeninivorans and S.
cerevisiae, 90, 90 and 89 percent identity was observed,
respectively.

The elongation factor 1α plays an essential role in
protein synthesis in eukaryotic cells by binding the aminoacyl-tRNA
to the ribosomes in exchange for the hydrolysis of GTP.

EXAMPLE 7: 

Sequence determination of the L2.17 related genomic and cDNA
sequence.

The 1435 bp L2.17 related genomic DNA was sequenced on both
strands.

The relevant part of the nucleotide sequence of the L2.17
genomic DNA is shown in figure 8. The sequence in figure 8 is the
same as shown in SEQ ID NO 2.

Alignment of the L2.17 related genomic DNA with the
corresponding cDNA showed that 759 bp of the genomic sequence was
located upstream of the 5' end of the cDNA sequence. Further, the
alignment strongly indicated the presence of an intron of 165
nucleotides: a 16 bp sequence in the genomic DNA (position -2 to
+14), including the putative ATG start codon, matched perfectly
with a sequence located at the 5' end of the cDNA sequence. There
was no homology between the genomic DNA and the cDNA in the
intervening sequence.

The 16 bp genomic sequence was followed by a 5' splice
site consensus sequence, GTGAGT (figure 8), at position 15 - 20. A
3' splice site consensus sequence, CAG (position 177 - 179), was
present at the 3' end of the 165 nucleotide intron, and an
internal consensus sequence for lariat formation, TA3CTAAC
(position 165 - 173), was present just upstream of the 3' consensus
sequence. These intron processing signals are very similar to
those that define introns in other organisms and to the signals
present in the intron of the pyruvate kinase-encoding gene (PYK)
of Y. lipolytica (table II).
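
The intron boundaries described for the L2.17 gene can be checked directly against the sequences in SEQ ID NO 2 and SEQ ID NO 4. The Python sketch below is illustrative: the helper name `excise` is hypothetical, and the sequences are copied from the listings, counting the A of the putative ATG as position +1.

```python
# Genomic L2.17 sequence from the A of the putative ATG (position +1) to the
# last base listed in SEQ ID NO 2 (position +192).
GENOMIC = (
    "ATGGCCCGAGGACCGTGAGT"
    "ATCCCCCACCCCCCGATCAGATGAGGCACAGACCAGGCTAGCCCATCGCTTTTAGAAGAA"
    "GGATAAGGGCTGTTCTGGGTGTGTCAAGAGGAGATGATGACGAGAAGCAAAGAGCTTCGA"
    "CTCAGTCGCCTCTGCCCCCACGAACTAAACTAACGCCAGCAAGAAGCATCTC"
)

# cDNA from its ATG onward, copied from the start of SEQ ID NO 4.
CDNA_START = "ATGGCCCGAGGACCCAAGAAGCATCTC"


def excise(seq, first, last):
    """Remove the 1-based inclusive interval [first, last] from seq."""
    return seq[:first - 1] + seq[last:]


intron = GENOMIC[15 - 1:179]          # positions 15..179 inclusive
spliced = excise(GENOMIC, 15, 179)    # exon 1 joined to the start of exon 2

print(len(intron), intron[:6], intron[-3:])
print(spliced == CDNA_START)
```

Removing positions 15 to 179 joins the 14 coding bases of the first exon to the downstream sequence and reproduces the 5' end of the cDNA, which is the observation on which the intron assignment rests.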



Table II. Consensus intron processing signals from several species and
the signals of the Y. lipolytica PYK gene intron.*

                             5' splice site   Internal conserved   3' splice site
                                              sequence
Higher eukaryotes            GGT A/G AGT      CT A/G A T/C         TAG
S. pombe, N. crassa          GT; GT; GGTA - CT A/G A T/A AG; T/C; - GTA- T - CTAAC T/C AG
S. cerevisiae                GGTATGT          TACTAAC              T/C AG
Y. lipolytica PYK intron     GGTGAGT          TACTAAC              CAG
Y. lipolytica L2.17 intron   GTGAGT           TA3CTAAC             CAG



*Y. lipolytica L2.17 splice sites and signals shown along with the
corresponding consensus signals from other organisms and the
Y. lipolytica PYK gene. Consensus sequences of higher eukaryotes,
S. pombe, N. crassa and S. cerevisiae (Hindley and Phear, 1984; Kaufer
et al, 1985). Y. lipolytica PYK gene (Strick et al, 1992).

As described earlier, it can be quite difficult to predict
the transcription initiation site on the basis of consensus
sequence data alone. In the case of the TEF gene yeast promoter
sequence, the presence of a CT-rich sequence pointed at a single
probable site. In the case of the L2.17 genomic sequence, several
initiation sites seem possible: four initiation consensus
sequences are located between the putative TATA boxes (position
-201 to -190) and the ATG start codon (see figure 8).

The fact that the cDNA sequences represented by L2.17 (L2.17
and L2.32, figure 4) both have their 5' end very close to the ATG
start codon could indicate that they represent almost full-length
cDNA clones. This assumption is further supported by the presence
of two transcription initiation consensus sequences just upstream
of the 5' end of the cDNA sequence.

The assumption of a transcriptional start site just
upstream of the ATG start codon disagrees with the observation of an
average leader sequence length in yeast of 52 bp (reviewed by Yoon
and Donahue 1992). Further, this position of the transcription
initiation sites is far outside the transcription initiation
window (40-120 bp downstream of the -201 to -190 TATA boxes).
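
The distance argument can be made concrete with a few lines of arithmetic. The Python sketch below is an illustration under stated assumptions: it takes the widest reading of the window (near edge measured from the upstream end of the TATA region, far edge from its downstream end), and it places the 5' end of the cDNA at -2 on the basis of the 2 bp 5'-noncoding region reported for the L2.17 cDNA. The helper `in_window` is hypothetical.

```python
TATA_START, TATA_END = -201, -190   # putative TATA boxes, from the text
WINDOW_MIN, WINDOW_MAX = 40, 120    # bp downstream of the TATA boxes

# Assumption: use the widest possible window by measuring the near edge from
# the upstream end of the TATA region and the far edge from its downstream end.
window = (TATA_START + WINDOW_MIN, TATA_END + WINDOW_MAX)

# The L2.17 cDNA has a 2 bp 5'-noncoding region, so its 5' end lies at -2.
cdna_5prime_end = -2


def in_window(position, bounds):
    """True if position lies within the inclusive bounds (low, high)."""
    low, high = bounds
    return low <= position <= high


print(window, in_window(cdna_5prime_end, window))
print("bp beyond window:", cdna_5prime_end - window[1])
```

Even with this generous reading of the window, the 5' end of the cDNA lies well downstream of its far edge.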

The existence of several possible transcription initiation
sites was examined further in a yeast promoter deletion analysis
(see below).

The L2.17 cDNA was sequenced on both strands. The nucleotide
sequence and the deduced amino acid sequence of the L2.17
cDNA are presented in figure 9. The sequence in figure 9 is the
same as shown in SEQ ID NO 4.

The 853 bp cDNA clone contains a 780 bp open reading frame
initiated with an ATG codon 3 bp downstream of the 5' terminal
nucleotide and terminated with a TAA stop codon 783 bp
downstream of the 5' terminal nucleotide, thereby predicting a
260-residue polypeptide of 29193 Da. The open reading frame is
preceded by a 2 bp 5'-noncoding region and followed by a 49 bp
3'-noncoding region and a poly(A) tail.
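
The stated lengths for the L2.17 cDNA can be laid out as consecutive segments to confirm that they are mutually consistent. The Python sketch below is illustrative: the function `cdna_layout` and its parameter names are hypothetical, the 49 bp 3'-noncoding region is assumed to begin immediately after the stop codon, and the poly(A) length is derived from the other figures rather than stated in the text.

```python
def cdna_layout(total_len, atg_pos, orf_len, utr3_len, stop_len=3):
    """Split a cDNA into named, 1-based inclusive segments."""
    segments = {}
    segments["5' noncoding"] = (1, atg_pos - 1)
    orf_end = atg_pos + orf_len - 1
    segments["ORF"] = (atg_pos, orf_end)
    segments["stop codon"] = (orf_end + 1, orf_end + stop_len)
    utr3_end = orf_end + stop_len + utr3_len
    segments["3' noncoding"] = (orf_end + stop_len + 1, utr3_end)
    # Whatever remains up to the end of the clone is taken as the poly(A) tail.
    segments["poly(A) tail"] = (utr3_end + 1, total_len)
    return segments


layout = cdna_layout(total_len=853, atg_pos=3, orf_len=780, utr3_len=49)
lengths = {name: end - start + 1 for name, (start, end) in layout.items()}

for name, (start, end) in layout.items():
    print(f"{name:14s} {start:4d}..{end:<4d} ({lengths[name]} bp)")
```

With these inputs the open reading frame spans positions 3 to 782, in agreement with the CDS location given in the sequence listing for SEQ ID NO 4.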
A FastA search (Pearson and Lipman, 1988) on the GenEMBL
database showed similarity of the L2.17 cDNA sequence to the
ribosomal protein S7 of S. cerevisiae and the corresponding
ribosomal protein S4 of e.g. D. melanogaster and H. sapiens.



WO 97/44470                                             PCT/DK97/00232

GAP alignments (Needleman and Wunsch, 1970) showed 69.2
percent similarity to a ribosomal protein S4 cDNA sequence of D.
melanogaster and 68.5 percent similarity to exon 1 and exon 2 of the
S. cerevisiae ribosomal protein S7.

At the amino acid level a BLASTX search (Altschul et al,
1990) on the swissprot database showed 82 percent identity to
the ribosomal protein S7 of S. cerevisiae, 74 percent identity to
the ribosomal protein S4 of D. melanogaster and 72 percent
identity to the ribosomal protein S4 of M. musculus and H.
sapiens.

The ribosomal protein S7 is the largest protein of the 40S
subunit and is essential for growth (Synetos et al 1992).

EXAMPLE 8: 

Strategy for deletion analyses of the cloned yeast promoters.

A detailed analysis of the function of a yeast promoter
involves sequence deletion studies as well as DNA/protein and
protein/protein interaction analyses.

Elongation factor 1α (TEF) yeast promoter deletions:

The strategy used for deletion studies of the TEF gene
yeast promoter sequence (SEQ ID No 1) is shown in figure 10. The
3' terminal nucleotide of the yeast promoter sequence was defined
as the last nucleotide in the 5' part of the genomic sequence
that was not represented in the cDNA sequence. This
definition is in agreement with the position of the putative
transcription initiation site, except for the presence of an
additional 12 bp located downstream of the putative transcription
initiation site. All editions of the TEF gene yeast promoter
sequence were cloned as ClaI/BamHI fragments in pY5 expression
vectors carrying cellulase II or xylanase I as reporter genes (see
table I and figure 2). The yeast promoter sequences were cloned
in the expression vectors as PCR copies, in which the 5' ClaI site
and the 3' BamHI site were introduced by the PCR primers.
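
Introducing restriction sites through PCR primers amounts to adding the recognition sequence as a 5' tail on each primer. The Python sketch below illustrates the idea for the full-length TEF promoter, i.e. the 366 bp of SEQ ID NO 1 that lie upstream of the cDNA. The primers it builds are hypothetical: the actual primer sequences are not given in this section, the 20 nt annealing length is an assumption, and the extra 5' bases usually added to help the enzymes cut near a fragment end are omitted.

```python
CLAI = "ATCGAT"    # ClaI recognition sequence
BAMHI = "GGATCC"   # BamHI recognition sequence

# Ends of the 366 bp region of SEQ ID NO 1 that lies upstream of the cDNA.
PROMOTER_5PRIME = "AGAGACCGGGTTGGCGGCGT"   # first 20 nt
PROMOTER_3PRIME = "CTTGTCAACTCACACCCGAA"   # last 20 nt before the cDNA start

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tailed_primers(five_prime_end, three_prime_end):
    """Build hypothetical primers carrying a 5' ClaI and a 3' BamHI site."""
    forward = CLAI + five_prime_end
    reverse = BAMHI + reverse_complement(three_prime_end)
    return forward, reverse


forward, reverse = tailed_primers(PROMOTER_5PRIME, PROMOTER_3PRIME)
print("forward:", forward)
print("reverse:", reverse)
```

A 5'-deleted edition would be obtained in the same way by choosing a forward annealing sequence further downstream in the promoter while keeping the reverse primer unchanged.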

The deletion study showed that the DNA sequence from
position -241 to position -41 in SEQ ID No 1 comprises the
essential element for yeast promoter activity.




Ribosomal protein S7 yeast promoter deletions:

The planned strategy for deletion analysis of the
ribosomal protein S7 yeast promoter sequence (SEQ ID No 2) is
shown in figure 11. Because a single probable transcription
initiation site could not be identified, 5' deleted editions
with two different 3' ends (position -3 and -109) were introduced
into the expression vectors. All editions of the ribosomal protein
S7 yeast promoter sequence were cloned as ClaI/BamHI fragments in
pY5 expression vectors carrying cellulase II or xylanase I as
reporter genes (see table I and figure 2). Editions affected by
the internal ClaI site were prepared by partial ClaI digest as
described earlier. Successful cloning was only obtained for B, D
and F (figure 11). The yeast promoter sequences were cloned in the
expression vectors as PCR copies, in which the 5' ClaI site and
the 3' BamHI site were introduced by the PCR primers.

The deletion study showed that the DNA sequence from
position -163 to position -3 in SEQ ID No 2 comprises the essential
element for yeast promoter activity.

EXAMPLE 9: 

Comparative yeast promoter activity studies.

The expression vectors based on the yeast promoter sequences of
the invention were tested with regard to their suitability as
expression cloning tools. The activity level of the yeast
promoters was examined both when PO1d transformants were grown on
selective substrate-containing plates and when transformants were
grown in selective liquid medium. Finally, the test gene products
were examined in a Western blot analysis. As a consequence of the
Northern blot analysis results, transformants containing either of
the new yeast promoters were grown on/in media in which glucose
was used as the carbohydrate source.

Activity on growth plates.

The activity level of the yeast promoters of the invention
was initially tested qualitatively, by growth of PO1d
transformants on selective substrate-containing plates (figure 12,
A and B). Only transformants that include the "full length"
editions of the cloned yeast promoters are shown (TEF gene yeast
promoter = pY5TA43kD and pY5TAX1, ribosomal protein S7 yeast
promoter = pY5RB43kD and pY5RBX1). PO1d transformed with the
corresponding expression vectors based on the XPR2 yeast promoter,
grown on XPR2 optimal medium substrate-containing plates, are
shown in figure 12, C and D. The experiment can be considered
an imitation of a screening event.

The growth plate activity experiment shows that the new
yeast promoters are very effective as screening tools in the case of
the tested reporter genes. Even in the HE-cellulose substrate
assay (which is known to be less sensitive than the xylan
substrate based assay) a significant degradation is seen, contrary
to the XPR2 yeast promoter based HE-cellulose substrate
degradation. As seen, neither of the enzymes was expressed at a
detectable level when present in the control vector constructs.

SEQUENCE LISTING

SEQ ID NO 1 shows the DNA sequence of the isolated DNA sequence encoding
yeast promoter activity, and having homology to the EF1-alpha promoter. This
sequence is identical to the L1.41 genomic DNA. See figure 6 for further
details.

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 409 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Yarrowia lipolytica EF1-alpha yeast promoter

(B) STRAIN: Yarrowia lipolytica 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 



AGAGACCGGG TTGGCGGCGT ATTTGTGTCC CAAAAAACAG CCCCAATTGC CCCAATTGAC  -60
CCCAAATTGA CCCAGTAGCG GGCCCAACCC CGGCGAGAGC CCCCTTCACC CCACATATCA  -120
AACCTCCCCC GGTTCCCACA CTTGCCGTTA AGGGCGTAGG GTACTGCAGT CTGGAATCTA  -180
CGCTTGTTCA GACTTTGTAC TAGTTTCTTT GTCTGGCCAT CCGGGTAACC CATGCCGGAC  -240
GCAAAATAGA CTACTGAAAA TTTTTTTGCT TTGTGGTTGG GACTTTAGCC AAGGGTATAA  -300
AAGACCACCG TCCCCGAATT ACCTTTCCTC TTCTTTTCTC TCTCTCCTTG TCAACTCACA  -360
CCCGAAATCG TTAAGCATTT CCTTCTGAGT ATAAGAATCA TTCAAAATG              3

SEQ ID NO 2 shows the DNA sequence of the isolated DNA sequence encoding
yeast promoter activity, and having homology to the ribosomal S7 gene yeast
promoter. This sequence is identical to the L2.17 genomic DNA. See figure 8
for further details.

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 952 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Yarrowia lipolytica Ribosomal S7 gene yeast promoter

(B) STRAIN: Yarrowia lipolytica 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



TCGACGGATC CACTTGTATG GCTCCAAGTT CAGTGTACCA AGTAGTTGGT GATGCAGGGA  -60
GGGATGTCTC TATCCACCAA TAATGAACTC ATGGGCGAAA TTGTTTCTGT TAAACACTCC  -120
AACTGTCGTT TTAAATCTCA TTCTCTTTGC ATTTGGACTC CATTCGCTTC CGTTGGGCCA  -180
TATAATCCAT CGTAACGTAC TTTAGATGGA AATTTAGTTA CCTGCTACTT GTCTCAACAC  -240
CCCAACAGGG GCTGTTCGAC AGAGGTAATA GAGCGTCAAT GGGTTAATAA AAACACACTG  -300
TCGATTTTCA CTCATTGTCT TTATGATATT ACCTGTTTTC CGCTGTTATC AATGCCGAGC  -360
ATCGTGTTAT ATCTTCCACC CCAACTACTT GCATTTACTT AACTATTACC TCAACTATTT  -420
ACACCCCGAA TTGTTACCTC CCAATAAGTA ACTTTATTTC AACCAATGGG ACGAGAGCAT  -480
CTCTGAGAAC ATCGATCTAT CTCTGTCAAT ATTGCCCAGA ATCGTTCGAA AAAAAACACC  -540
AAAAGGTTTA CAGCGCCATT ATAAATATAA ATTCGTTGTC AATTCCCCCG CAATGTCTGT  -600
TGAAATCTCA TTTTGAGACC TTCCAACATT ACCCTCTCTC CCGTCTGGTC ACATGACGTG  -660
ACTGCTTCTT CCCAAAACGA ACACTCCCAA CTCTTCCCCC CCGTCAGTGA AAAGTATACA  -720
TCCGACCTCC AAATCTTTTC TTCACTCAAC AAACACAAAA ATGGCCCGAG GACCGTGAGT  20
ATCCCCCACC CCCCGATCAG ATGAGGCACA GACCAGGCTA GCCCATCGCT TTTAGAAGAA  80
GGATAAGGGC TGTTCTGGGT GTGTCAAGAG GAGATGATGA CGAGAAGCAA AGAGCTTCGA  140
CTCAGTCGCC TCTGCCCCCA CGAACTAAAC TAACGCCAGC AAGAAGCATC TC          192

SEQ ID NO 3 shows the DNA sequence and the deduced amino acid sequence of the
translation factor EF-1alpha cDNA from Y. lipolytica. This is the sequence
corresponding to (downstream of) the EF-1alpha yeast promoter shown in SEQ ID
NO 1. See figure 7 for further details.

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: EF-1alpha

(B) STRAIN: Yarrowia lipolytica

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 41..1420

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATCGTTAAGC ATTTCCTTCT GAGTATAAGA ATCATTCAAA ATG GGA AAG GAA AAG  55
                                            Met Gly Lys Glu Lys
                                            1               5

ACT CAC GTT AAC CTC GTT GTC ATC GGT CAC GTC GAT GCC GGT AAG TCC  103
Thr His Val Asn Leu Val Val Ile Gly His Val Asp Ala Gly Lys Ser
                10                  15                  20

ACC ACC ACT GGT CAC CTT ATC TAC AAG TGC GGT GGT ATC GAT AAG CGA  151
Thr Thr Thr Gly His Leu Ile Tyr Lys Cys Gly Gly Ile Asp Lys Arg
            25                  30                  35

ACC ATC GAG AAG TTC GAG AAG GAG GCC GAC GAG CTT GGA AAG GGT TCT  199
Thr Ile Glu Lys Phe Glu Lys Glu Ala Asp Glu Leu Gly Lys Gly Ser
        40                  45                  50

TTC AAG TAC GCT TGG GTT CTT GAC AAG CTT AAG GCT GAG CGA GAG CGA  247
Phe Lys Tyr Ala Trp Val Leu Asp Lys Leu Lys Ala Glu Arg Glu Arg
    55                  60                  65

GGT ATC ACC ATT GAT ATT GCT CTC TGG AAG TTC CAG ACC CCT AAG TAC  295
Gly Ile Thr Ile Asp Ile Ala Leu Trp Lys Phe Gln Thr Pro Lys Tyr
70                  75                  80                  85

TAC GTC ACC GTT ATT GAT GCT CCC GGT CAC CGA GAT TTC ATC AAG AAC  343
Tyr Val Thr Val Ile Asp Ala Pro Gly His Arg Asp Phe Ile Lys Asn
                90                  95                  100

ATG ATC ACC GGT ACC TCC CAG GCC GAC TGT GCC ATC CTC ATC ATT GCT  391
Met Ile Thr Gly Thr Ser Gln Ala Asp Cys Ala Ile Leu Ile Ile Ala
            105                 110                 115

GGT GGT GTT GGT GAG TTC GAG GCT GGT ATC TCC AAG GAC GGT CAG ACC  439
Gly Gly Val Gly Glu Phe Glu Ala Gly Ile Ser Lys Asp Gly Gln Thr
        120                 125                 130

CGA GAG CAC GCT CTG CTC GCT TTC ACC CTC GGT GTC AAG CAG CTG ATT  487
Arg Glu His Ala Leu Leu Ala Phe Thr Leu Gly Val Lys Gln Leu Ile
    135                 140                 145

GTT GCC ATC AAC AAG ATG GAC TCC GTC AAG TGG TCT CAG GAT CGA TAC  535
Val Ala Ile Asn Lys Met Asp Ser Val Lys Trp Ser Gln Asp Arg Tyr
150                 155                 160                 165

AAC GAG ATC TGC AAG GAG ACC GCC AAC TTC GTC AAG AAG GTT GGT TAC  583
Asn Glu Ile Cys Lys Glu Thr Ala Asn Phe Val Lys Lys Val Gly Tyr
                170                 175                 180

AAC CCT AAG TCT GTC CCC TTT GTC CCT ATT TCC GGA TGG AAC GGT GAC  631
Asn Pro Lys Ser Val Pro Phe Val Pro Ile Ser Gly Trp Asn Gly Asp
            185                 190                 195

AAC ATG ATT GAG GCC TCC ACC AAC TGT GAC TGG TAC AAG GGC TGG ACC  679
Asn Met Ile Glu Ala Ser Thr Asn Cys Asp Trp Tyr Lys Gly Trp Thr
        200                 205                 210

AAG GAG ACC AAG GCC GGT GAG GTC AAG GGT AAG ACC CTC CTT GAG GCC  727
Lys Glu Thr Lys Ala Gly Glu Val Lys Gly Lys Thr Leu Leu Glu Ala
    215                 220                 225

ATT GAC GCC ATT GAG CCC CCC GTG CGA CCC TCC GAC AAG CCC CTC CGA  775
Ile Asp Ala Ile Glu Pro Pro Val Arg Pro Ser Asp Lys Pro Leu Arg
230                 235                 240                 245

CTT CCT CTC CAG GAT GTC TAC AAG ATC GGT GGT ATC GGC ACA GTG CCC  823
Leu Pro Leu Gln Asp Val Tyr Lys Ile Gly Gly Ile Gly Thr Val Pro
                250                 255                 260

GTT GGC CGA GTC GAG ACC GGT GTT ATC AAG GCC GGT ATG GTT GTT ACC  871
Val Gly Arg Val Glu Thr Gly Val Ile Lys Ala Gly Met Val Val Thr
            265                 270                 275

TTC GCT CCC GCC AAC GTG ACC ACT GAG GTC AAG TCT GTC GAG ATG CAC  919
Phe Ala Pro Ala Asn Val Thr Thr Glu Val Lys Ser Val Glu Met His
        280                 285                 290

CAC GAG ATC CTC CCC GAC GGA GGT TTC CCC GGT GAC AAC GTT GGC TTC  967
His Glu Ile Leu Pro Asp Gly Gly Phe Pro Gly Asp Asn Val Gly Phe
    295                 300                 305

AAC GTC AAG AAC GTT TCC GTC AAG GAT ATC CGA CGA GGT AAC GTT GCC  1015
Asn Val Lys Asn Val Ser Val Lys Asp Ile Arg Arg Gly Asn Val Ala
310                 315                 320                 325

GGT GAC TCC AAG AAC GAC CCC CCT AAT GGC TGC GAC TCT TTC AAC GCT  1063
Gly Asp Ser Lys Asn Asp Pro Pro Asn Gly Cys Asp Ser Phe Asn Ala
                330                 335                 340

CAG GTC ATT GTT CTT AAC CAC CCC GGT CAG ATC GGT GCT GGT TAC GCT  1111
Gln Val Ile Val Leu Asn His Pro Gly Gln Ile Gly Ala Gly Tyr Ala
            345                 350                 355

CCC GTT CTT GAT TGC CAC ACT GCC CAC ATT GCC TGC AAG TTC GAC ACC  1159
Pro Val Leu Asp Cys His Thr Ala His Ile Ala Cys Lys Phe Asp Thr
        360                 365                 370

CTG ATC GAG AAG ATC GAC CGA CGA ACC GGT AAG AAG ATG GAG GAC TCC  1207
Leu Ile Glu Lys Ile Asp Arg Arg Thr Gly Lys Lys Met Glu Asp Ser
    375                 380                 385

CCC AAG TTC ATC AAG TCT GGT GAT GCC GCC ATT GTC AAG ATG GTC CCC  1255
Pro Lys Phe Ile Lys Ser Gly Asp Ala Ala Ile Val Lys Met Val Pro
390                 395                 400                 405

TCC AAG CCC ATG TGT GTT GAG GCC TTC ACT GAG TAC CCC CCT CTT GGT  1303
Ser Lys Pro Met Cys Val Glu Ala Phe Thr Glu Tyr Pro Pro Leu Gly
                410                 415                 420

CGA TTC GCC GTC CGA GAC ATG CGA CAG ACC GTT GCT GTC GGT GTC ATC  1351
Arg Phe Ala Val Arg Asp Met Arg Gln Thr Val Ala Val Gly Val Ile
            425                 430                 435

AAG TCC GTC GAG AAG TCC GAC AAG GCT GGT GGA AAG GTC ACC AAG GCT  1399
Lys Ser Val Glu Lys Ser Asp Lys Ala Gly Gly Lys Val Thr Lys Ala
        440                 445                 450

GCC CAG AAG GCT GCC AAG AAA TAAGCTGCTT GTACCTAGTG CAACCCCAGT  1450
Ala Gln Lys Ala Ala Lys Lys
    455                 460

TTGTTAAAAA TTAGTAGTCA AAAACTTCTG AGTTAAAAAA AAAAAAAAAA  1500

SEQ ID NO 4 shows the DNA sequence and the deduced amino acid sequence of
the ribosomal protein S7 cDNA from Y. lipolytica. This is the sequence
corresponding to (downstream of) the ribosomal protein S7 yeast promoter shown
in SEQ ID NO 2. See figure 9 for further details.

(2) INFORMATION FOR SEQ ID NO: 4 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 853 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Ribosomal protein S7

(B) STRAIN: Yarrowia lipolytica

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..782

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AA ATG GCC CGA GGA CCC AAG AAG CAT CTC AAG CGA CTC GCA GCT CCC  47
   Met Ala Arg Gly Pro Lys Lys His Leu Lys Arg Leu Ala Ala Pro
                   465                 470                 475

TCC CAC TGG ATG CTG GAC AAG CTG TCC GGC ACC TAC GCT CCC CGA TCG  95
Ser His Trp Met Leu Asp Lys Leu Ser Gly Thr Tyr Ala Pro Arg Ser
                480                 485                 490

TCT GCC GGT CCC CAC AAG CTG CGA GAG TCT CTG CCT CTC GTC ATC TTC  143
Ser Ala Gly Pro His Lys Leu Arg Glu Ser Leu Pro Leu Val Ile Phe
            495                 500                 505

CTG CGA AAC CGT CTC AAG TAC GCC CTG AAC GGC CGA GAG GTT AAC GCC  191
Leu Arg Asn Arg Leu Lys Tyr Ala Leu Asn Gly Arg Glu Val Asn Ala
        510                 515                 520

ATT CTC ATG CAG CGA CTG GTC AAG GTC GAC GGC AAG GTC CGA ACC GAC  239
Ile Leu Met Gln Arg Leu Val Lys Val Asp Gly Lys Val Arg Thr Asp
    525                 530                 535

TCC ACT TTC CCC GCT GGC TTC ATG GAT GTC ATC CAG CTC GAG AAG ACC  287
Ser Thr Phe Pro Ala Gly Phe Met Asp Val Ile Gln Leu Glu Lys Thr
540                 545                 550                 555

GGC GAG AAC TTC CGA CTT GTC TAC GAC GTC AAG GGC CGA TTT GCC GTC  335
Gly Glu Asn Phe Arg Leu Val Tyr Asp Val Lys Gly Arg Phe Ala Val
                560                 565                 570

CAC CGA ATC ACC GAT GAG GAG GCT GCT TAC AAG CTC GGC AAG GTC AAG  383
His Arg Ile Thr Asp Glu Glu Ala Ala Tyr Lys Leu Gly Lys Val Lys
            575                 580                 585

CGA GTC CAG GTT GGC AAG AAG GGT ATC CCC TAC CTC GTC ACC CAC GAC  431
Arg Val Gln Val Gly Lys Lys Gly Ile Pro Tyr Leu Val Thr His Asp
        590                 595                 600

GGC CGA ACC ATC CGG TAC CCC GAC CCT CTC ATC AAG GTC AAC GAC ACC  479
Gly Arg Thr Ile Arg Tyr Pro Asp Pro Leu Ile Lys Val Asn Asp Thr
    605                 610                 615

GTC AAG ATC GAC CTG GCC ACC GGC AAG ATC ACC TCT TTC GTC AAG TTT  527
Val Lys Ile Asp Leu Ala Thr Gly Lys Ile Thr Ser Phe Val Lys Phe
620                 625                 630                 635

GAG AAC GGT AAC ATT GTC ATG ACC ACC GGA GGT CGA AAC ATG GGC CGA  575
Glu Asn Gly Asn Ile Val Met Thr Thr Gly Gly Arg Asn Met Gly Arg
                640                 645                 650

GTC GGC ACC ATC ACC CAC CGA GAG CGA CAT GAG GGT GGC TTC GAT ATC  623
Val Gly Thr Ile Thr His Arg Glu Arg His Glu Gly Gly Phe Asp Ile
            655                 660                 665

GTC CAC ATC AAG GAC GCT CTT GAC AAC CAG TTT GTT ACC CGA CTC ACT  671
Val His Ile Lys Asp Ala Leu Asp Asn Gln Phe Val Thr Arg Leu Thr
        670                 675                 680

AAC GTT TTC GTT ATC GGT GAG GGC AAC AAG TCT CTC ATC TCT CTG CCC  719
Asn Val Phe Val Ile Gly Glu Gly Asn Lys Ser Leu Ile Ser Leu Pro
    685                 690                 695

AAG GGC AAG GGT ATC AAG CTC TCC ATT GCT GAG GAG CGA GAT GCC CGA  767
Lys Gly Lys Gly Ile Lys Leu Ser Ile Ala Glu Glu Arg Asp Ala Arg
700                 705                 710                 715

CGA GCC AAG CAG GAG TAAGTTCAGA TTGGAACAAC ATTGGTTTAG CTAAAAAAAA  822
Arg Ala Lys Gln Glu
                720

GGATTCATGT TTAAAAAAAA AAAAAAAAAA A  853

CLAIMS

1. A cloned yeast promoter DNA sequence, which comprises

a) the DNA sequence from position -241 to -41 shown in SEQ ID NO 1, or

b) an analogue of the DNA sequence defined in a) which

i) is at least 90 % homologous with said DNA sequence, or
ii) hybridises with the same nucleotide probe as defined in a).

2. The yeast promoter according to claim 1, wherein said DNA
sequence in a) is from position -407 to -41 in SEQ ID NO 1.

3. The yeast promoter according to claim 1 or 2, wherein the
yeast promoter is obtainable from a yeast.

4. The yeast promoter according to claim 3, wherein a yeast is a
strain of Yarrowia lipolytica.

5. The yeast promoter according to any of claims 1-4, wherein
the yeast promoter is a promoter of the EF-1α protein.

6. A cloned yeast promoter DNA sequence, which comprises

a) the DNA sequence from -163 to -3 shown in SEQ ID NO 2, or

b) an analogue of the DNA sequence defined in a) which

i) is at least 90 % homologous with said DNA sequence, or

ii) hybridises with the same nucleotide probe as defined in a).

7. The yeast promoter according to claim 6, wherein said DNA
sequence in a) is from position -543 to -3 in SEQ ID NO 2.

8. The yeast promoter according to claim 6 or 7, wherein the
yeast promoter is obtainable from a yeast.

9. The yeast promoter according to claim 8, wherein said yeast
is a strain of Yarrowia lipolytica.

10. The yeast promoter according to any of claims 6-9, wherein
the yeast promoter is a promoter of the ribosomal protein S7
gene.

11. The yeast promoter according to any of claims 1 to 5 or any
of claims 6 to 10, which is especially useful for expression
cloning in yeast, characterised by:

i) having promoter activity in a selective medium,

ii) having promoter activity in a medium where it is easy to
purify a secreted polypeptide, e.g. a medium which does not
comprise degraded protein, and

iii) having promoter activity at least in the pH range from
4 to 11.

12. The promoter according to claim 11, wherein the medium under
iii) does not comprise peptone.

13. The promoter according to claim 11, wherein the pH range is
from 5 to 9.

14. An expression vector comprising a yeast promoter according
to any of claims 1-13.

15. An expression cloning method in yeast, comprising

(a) cloning, in expression vectors according to claim 14, a DNA
library from an organism suspected of producing one or more
proteins of interest,

(b) transforming suitable yeast host cells with said vectors,

(c) culturing the host cells under suitable conditions to express
any protein of interest encoded by a clone in the DNA library, and

(d) screening for positive clones by determining any activity of a
protein expressed in step (c).

16. The method according to claim 14, wherein said yeast is a
strain of Yarrowia lipolytica.

17. A recombinant expression vector comprising

i) a promoter according to any of claims 1 to 5 or 6 to 10
or 11 to 13 and,

ii) a DNA sequence coding for a protein of interest.

18. The recombinant expression vector according to claim 17,
wherein said promoter is according to any of claims 1 to 5.

19. The recombinant expression vector according to claim 17,
wherein said promoter is according to any of claims 6 to 10.

20. The recombinant expression vector according to claim 17,
wherein said promoter is according to any of claims 11 to 13.

21. A yeast host cell transformed with a recombinant expression
vector according to any of claims 17-20.

22. A process for producing a protein in yeast comprising
culturing a yeast host cell transformed with a recombinant
expression vector according to any of claims 17-20 under
conditions permitting production of said protein, and recovering
the resulting protein from the culture.

23. The process according to claim 22, wherein said yeast host
cell is a strain of Yarrowia lipolytica.

24. Use of a produced protein according to claim 22 for various
industrial applications.

25. The use according to claim 24, wherein said produced protein
is an enzyme.

Fig. 7. Nucleotide sequence and deduced amino acid sequence of the L1.41 cDNA (SEQ ID NO 3).

[Figure: nucleotide sequence of the L2.17 related genomic DNA with the BamHI and ClaI sites marked (SEQ ID NO 2).]

[Figure: nucleotide sequence and deduced amino acid sequence of the L2.17 cDNA (SEQ ID NO 4).]